Conversation
train day sunday->tuesday

Note
Reviews paused — this branch appears to be under active development. To avoid overwhelming you with review comments during an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the repository settings.
Walkthrough
Adds a weekly model retraining and deployment pipeline: Kaggle-based training/deployment scripts, notebooks, and a workflow. Automates DB→Parquet extraction, Kaggle dataset upload, notebook triggering, weight download, and SSH-based server deployment.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant GHA as GitHub Actions
    participant Pipeline as weekly_routine.py
    participant Extract as extract_to_parquet.py
    participant Upload as upload_to_kaggle.py
    participant Trigger as trigger_training.py
    participant Kaggle as Kaggle
    participant Download as download_weights.py
    participant Deploy as deploy_to_server.py
    participant Server as Remote Server
    GHA->>Pipeline: trigger (cron / workflow_dispatch)
    Pipeline->>Extract: 1) DB -> Parquet
    Extract-->>Pipeline: success/failure
    Pipeline->>Upload: 2) upload Kaggle dataset
    Upload-->>Pipeline: success/failure
    Pipeline->>Trigger: 3) push and run notebooks
    Trigger->>Kaggle: push notebook -> execute
    Kaggle-->>Trigger: status (complete/failed)
    Trigger-->>Pipeline: list of successes/failures
    Pipeline->>Download: 4) download weights
    Download-->>Pipeline: files / failure log
    Pipeline->>Deploy: 5) deploy to server
    Deploy->>Server: upload files via SSH/SCP
    Server-->>Deploy: receipt confirmed
    Deploy-->>Pipeline: deployment result
    Pipeline-->>GHA: overall pipeline exit status
```
```mermaid
sequenceDiagram
    autonumber
    participant Notebook as Kaggle Notebook
    participant Repo as Git Repo
    participant Script as train_kaggle.py
    participant Data as Parquet data
    participant Model as Training loop
    participant Output as Kaggle Output
    Notebook->>Repo: git clone sisc-web
    Notebook->>Notebook: extend sys.path / install dependencies
    Notebook->>Script: load train_kaggle via importlib
    Script->>Data: load Parquet files
    Script->>Script: preprocess and fit scaler (train-only)
    Script->>Model: build and train model
    Model->>Output: save checkpoints (.pt/.keras), scalers (.pkl), metadata.json
    Script-->>Notebook: training complete; list output files
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 20
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🟡 Minor comments (2)
AI/kaggle_notebooks/TCN/notebook.ipynb (1)

60-62: ⚠️ Potential issue | 🟡 Minor — Pin package versions.
Unpinned versions can cause reproducibility problems. In particular, the manifest in library_context states that numpy<2.0 is required, but the current install command may pull the latest (2.0+) release.

🔧 Suggested version pin

```diff
-"!pip install -q scikit-learn pandas numpy torch"
+"!pip install -q scikit-learn pandas 'numpy<2.0' torch"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/TCN/notebook.ipynb` around lines 60 - 62, The pip install cell currently runs "!pip install -q scikit-learn pandas numpy torch" without pinned versions; update that cell (the pip install command string) to pin package versions to reproducible releases (e.g., ensure numpy is constrained to <2.0 as required by the library_context/manifest and pin compatible versions of scikit-learn, pandas, and torch) so the notebook installs deterministic, manifest-compatible packages.

AI/scripts/deploy_to_server.py (1)

76-77: ⚠️ Potential issue | 🟡 Minor — Only RSA keys are supported; Ed25519/ECDSA keys fail.
RSAKey.from_private_key() only handles RSA keys. Using an Ed25519 or ECDSA key raises an exception.

♻️ Support multiple key types

```diff
-    # Use the key string from the environment variable like a file
-    private_key = paramiko.RSAKey.from_private_key(io.StringIO(SSH_KEY_STR))
+    # Support multiple key types
+    key_file = io.StringIO(SSH_KEY_STR)
+    private_key = None
+    for key_class in (paramiko.Ed25519Key, paramiko.ECDSAKey, paramiko.RSAKey):
+        try:
+            key_file.seek(0)
+            private_key = key_class.from_private_key(key_file)
+            break
+        except paramiko.SSHException:
+            continue
+    if private_key is None:
+        raise ValueError("Unsupported SSH key format.")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/deploy_to_server.py` around lines 76 - 77, The current code only calls paramiko.RSAKey.from_private_key(io.StringIO(SSH_KEY_STR)), which fails for Ed25519/ECDSA keys; update the logic that creates private_key to try parsing SSH_KEY_STR with multiple Paramiko key classes (e.g., paramiko.RSAKey.from_private_key, paramiko.ECDSAKey.from_private_key, paramiko.Ed25519Key.from_private_key, and paramiko.DSSKey.from_private_key) using io.StringIO(SSH_KEY_STR), catching exceptions and falling through until one succeeds, and raise a clear error if none parse; reference the private_key variable and the from_private_key calls to locate where to implement this fallback parsing.
🧹 Nitpick comments (9)
AI/kaggle_notebooks/TCN/notebook.ipynb (1)
33-42: Recommended: add error handling for git clone failures.
The returncode of the subprocess.run() result is not checked, so a failed clone surfaces as an unclear error in later cells.

♻️ Suggested error handling

```diff
 "result = subprocess.run(\n",
 "    ['git', 'clone', '--depth', '1',\n",
 "     'https://github.com/SISC-IT/sisc-web.git',\n",
 "     '/kaggle/working/sisc-web'],\n",
 "    capture_output=True, text=True\n",
 ")\n",
 "print(result.stdout)\n",
-"print(result.stderr)"
+"print(result.stderr)\n",
+"if result.returncode != 0:\n",
+"    raise RuntimeError(f'Git clone failed: {result.stderr}')"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/TCN/notebook.ipynb` around lines 33 - 42, The subprocess.run call that clones the repo doesn't check for errors; update the clone logic that uses subprocess.run and the result variable by either adding check=True to subprocess.run or explicitly inspecting result.returncode (and using result.stderr/result.stdout) after the call; if returncode != 0, log/print a clear error message including result.stderr and raise an exception (e.g., RuntimeError or re-raise CalledProcessError) so downstream cells don't proceed on a failed git clone.

.gitignore (1)

47-47: The global test_*.py ignore pattern is far too broad.
Because of line 47, any new test file anywhere in the repository stays untracked. Even newly added pipeline/model tests may never reach CI, so it is safer to scope the pattern to the specific directories where throwaway scripts are needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore at line 47, the global pattern "test_*.py" ignores test files across the entire repository, risking that they never reach CI; remove the "test_*.py" entry or narrow it to specific directories (e.g., "scripts/test_*.py" or "dev_tools/test_*.py"), verify the changed pattern only ignores files in the intended locations, then commit; reference target: find and replace or delete the "test_*.py" pattern in .gitignore.

AI/modules/signal/models/PatchTST/wrapper.py (1)
191-193: Keep the get_signals() signature aligned with the base contract.
It still gets called today, but a signature that diverges from BaseSignalModel weakens override/type-checking guarantees. Keeping ticker_id and sector_id explicit, as the other wrappers do, is safer for maintenance.

Example fix

```diff
-    def get_signals(self, df, **kwargs):
+    def get_signals(self, df: pd.DataFrame, ticker_id: int = 0, sector_id: int = 0) -> Dict[str, float]:
         """BaseSignalModel 추상 메서드 구현 - predict()로 위임"""
         return self.predict(df)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/PatchTST/wrapper.py` around lines 191 - 193, Update the get_signals method to match the BaseSignalModel contract by adding the explicit ticker_id and sector_id parameters (e.g., def get_signals(self, df, ticker_id=None, sector_id=None, **kwargs)) while keeping the current behavior of delegating to predict; ensure the method accepts those parameters and forwards the dataframe to self.predict(df) (and any necessary kwargs) so overrides/type-checking remain consistent with other wrappers and BaseSignalModel.AI/modules/signal/models/itransformer/train_kaggle.py (1)
326-327: Unnecessary int() call
len() already returns an integer.

```diff
-        "n_train_samples": int(len(X_train)),
-        "n_val_samples" : int(len(X_val)),
+        "n_train_samples": len(X_train),
+        "n_val_samples" : len(X_val),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 326 - 327, The dict entries for "n_train_samples" and "n_val_samples" wrap len(...) with an unnecessary int() call; remove the int() so use "n_train_samples": len(X_train) and "n_val_samples": len(X_val) (locate these keys in the training metadata construction in train_kaggle.py).

.github/workflows/train.yml (2)

58-63: Handle special characters when generating the Kaggle credentials JSON
If a secret value contains quotes or other special characters, the JSON can break or a shell injection can occur. Using jq is safer.

🛡️ Suggested improvement

```diff
       - name: Setup Kaggle credentials
         run: |
           mkdir -p ~/.kaggle
-          echo '{"username":"${{ secrets.KAGGLE_USERNAME }}","key":"${{ secrets.KAGGLE_KEY }}"}' \
-            > ~/.kaggle/kaggle.json
+          jq -n --arg user "${{ secrets.KAGGLE_USERNAME }}" --arg key "${{ secrets.KAGGLE_KEY }}" \
+            '{"username": $user, "key": $key}' > ~/.kaggle/kaggle.json
           chmod 600 ~/.kaggle/kaggle.json
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/train.yml around lines 58 - 63, Replace the fragile echo that constructs the Kaggle JSON with a safe JSON builder: use jq (or printf piped to jq -n --arg) to create the JSON object from the GitHub secrets and write it to ~/.kaggle/kaggle.json, then set chmod 600; specifically update the "Setup Kaggle credentials" step to pass ${{ secrets.KAGGLE_USERNAME }} and ${{ secrets.KAGGLE_KEY }} as jq arguments (instead of interpolating them into a raw string) to avoid JSON corruption or shell injection.
31-31: Recommended: update GitHub Actions versions
Static analysis warns that actions/checkout@v3 and actions/setup-python@v4 are outdated. Update to the latest versions.

♻️ Suggested fix

```diff
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       # ────────────────────────────────────────
       # 2. Set up the Python environment
       # ────────────────────────────────────────
       - name: Setup Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
```

Also applies to: 37-37
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/train.yml at line 31, Update the GitHub Actions steps that reference actions/checkout@v3 and actions/setup-python@v4 to their current stable releases: locate the `uses: actions/checkout@v3` and `uses: actions/setup-python@v4` lines and bump them to the latest recommended major versions (e.g., actions/checkout@v4 and actions/setup-python@v5 or the most recent stable tags), then commit the change and run a workflow lint/validation to ensure compatibility with any inputs you rely on.AI/modules/signal/models/PatchTST/train_kaggle.py (2)
185-194: Improve exception handling: preserve the exception chain and use specific exception types
Use raise ... from e to preserve the original exception; it helps trace the root cause when debugging.

♻️ Suggested improvement

```diff
         try:
             df_t = add_technical_indicators(df_t)
             df_t = add_multi_timeframe_features(df_t)
             processed.append(df_t)
-        except Exception as e:
+        except (ValueError, KeyError, TypeError) as e:
             fail_count += 1
             print(f"\n[경고] {ticker} 피처 계산 실패 ({fail_count}/{fail_limit}): {e}")
             if fail_count >= fail_limit:
-                raise RuntimeError(f"피처 계산 실패가 {fail_limit}개를 초과했습니다.")
+                raise RuntimeError(f"피처 계산 실패가 {fail_limit}개를 초과했습니다.") from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/PatchTST/train_kaggle.py` around lines 185 - 194, The except block in the try around add_technical_indicators and add_multi_timeframe_features swallows the original exception; update the handler in train_kaggle.py to re-raise a RuntimeError using "raise RuntimeError(... ) from e" so the exception chain is preserved and include the original exception message, and consider narrowing the except to specific exceptions thrown by add_technical_indicators/add_multi_timeframe_features (e.g., ValueError/TypeError) instead of a bare Exception; keep the existing fail_count/fail_limit logic and logging but ensure the re-raise uses "from e".
321-321: Unnecessary f-string prefix
This f-string contains no placeholders.

```diff
-    print(f"\n>> 완료")
+    print("\n>> 완료")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/PatchTST/train_kaggle.py` at line 321, Replace the unnecessary f-string in the final print statement: locate the print call in train_kaggle.py that currently uses print(f"\n>> 완료") and change it to a normal string print (remove the leading "f") so it reads print("\n>> 완료"), leaving the rest of the function and surrounding code unchanged.AI/pipelines/weekly_routine.py (1)
30-30: Unused import: traceback
The traceback module is imported but never used.

♻️ Remove it

```diff
 import subprocess
-import traceback
 from datetime import datetime
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/pipelines/weekly_routine.py` at line 30, Remove the unused import of the traceback module: delete the "import traceback" line in AI/pipelines/weekly_routine.py (the unused symbol is traceback) so there are no unused imports left; run linters to confirm no other unused imports remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/kaggle_notebooks/itransformer/notebook.ipynb`:
- Around line 73-90: The current logic may load and run train_kaggle.py twice
and attempts a fallback using a non-existent path; replace the exec/open(...)
path with a single importlib-based loader: check that train_script exists and if
not raise/print and exit immediately, otherwise use
importlib.util.spec_from_file_location('train_kaggle', train_script),
module_from_spec(spec), spec.loader.exec_module(module) and call module.train();
remove the exec(open(...).read()) and the erroneous else branch that reuses the
missing path so training is loaded and invoked exactly once via module.train().
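The importlib-based loading described above can be sketched as follows. This is a minimal stand-alone sketch: the "train_kaggle" module name and the train() entry point come from the prompt itself, while the helper name load_and_run_train is hypothetical.

```python
import importlib.util
import os


def load_and_run_train(train_script: str):
    """Load a training script exactly once via importlib and call its train()."""
    if not os.path.exists(train_script):
        # Fail fast instead of falling back to a non-existent path.
        raise FileNotFoundError(f"training script not found: {train_script}")
    spec = importlib.util.spec_from_file_location("train_kaggle", train_script)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module's top-level code once
    if not hasattr(module, "train"):
        raise AttributeError("loaded module does not define train()")
    return module.train()
```

Unlike exec(open(...).read()), this loads the script exactly once, leaves no file handle open, and gives the loaded code a proper module identity.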
In `@AI/kaggle_notebooks/patchtst/notebook.ipynb`:
- Around line 36-41: Remove the second and third notebook cells that copy/load
non-existent files and keep the first cell's clone + sys.path approach; instead
import PatchTST_Model and the feature functions directly from the cloned repo
(from AI.modules.signal.models.PatchTST.architecture import PatchTST_Model and
from AI.modules.features.legacy.technical_features import
add_technical_indicators, add_multi_timeframe_features), set any needed env vars
(e.g., PARQUET_DIR, WEIGHTS_DIR), and if you must run the training script
dynamically, load the repository's train_kaggle.py via
importlib.util.spec_from_file_location pointing to
/kaggle/working/sisc-web/AI/modules/signal/models/PatchTST/train_kaggle.py and
invoke its train() entry point instead of copying non-existent files; also
ensure kernel_sources in kernel-metadata.json is populated or left empty
consistently with this import strategy.
In `@AI/kaggle_notebooks/TCN/notebook.ipynb`:
- Around line 69-79: Replace the exec-based loading of train_script with an
importlib-based dynamic module import: when train_script exists, create a module
spec from that path (use importlib.util.spec_from_file_location and
importlib.util.module_from_spec), load the module via spec.loader.exec_module,
then call the exported train_model_kaggle function from the loaded module (guard
the call with hasattr/module attribute check and raise or log a clear error if
train_model_kaggle is missing), and remove the open(...).read() pattern so no
file descriptor is left open; keep the existing fallback print for the else
branch.
In `@AI/kaggle_notebooks/transformer/notebook.ipynb`:
- Around line 48-51: The code sets os.environ before importing or defining os
which causes a NameError; add an import os at the top of the cell (before any
use of os.environ) so the os module is available; locate the cell containing the
os.environ assignments (references to os.environ and the symbol os) and insert
import os as the first statement in that cell.
- Around line 36-39: To fix the NameError/ModuleNotFoundError raised by this
cell, first import shutil and import the technical-indicator module from the
correct path: instead of the current from technical_features import
add_technical_indicators, add_multi_timeframe_features, import
add_technical_indicators and add_multi_timeframe_features from the
already-used AI.modules.features.legacy.technical_features (or reuse the
existing import, e.g. AI.modules.features.legacy.technical_features) so the
module path is referenced correctly, and make sure shutil is imported before
the shutil.copy call.
In `@AI/modules/signal/core/artifact_paths.py`:
- Around line 104-110: The resolver is returning lowercase "patchtst" paths
while training/saving uses "PatchTST", causing mismatched directories on
case-sensitive filesystems; update the handling in
AI/modules/signal/core/artifact_paths.py so that when normalized_model ==
"patchtst" you resolve to the same cased directory and filenames used by the
trainer (use "PatchTST" for resolved_model_dir and the matching artifact
filenames), i.e. change the resolved_model_dir, model_path and scaler_path
construction in the ModelArtifactPaths return so they reference "PatchTST" and
the exact saved filenames produced by AI/modules/signal/models/PatchTST/train.py
(use the same casing and basename as the saver).
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Around line 160-171: The scaler is being refit per ticker inside the for
ticker, group in df.groupby("ticker") loop (using scaler.fit_transform when
fit_scaler=True), causing leakage and inconsistent statistics; instead, if
fit_scaler is True, call scaler.fit once on the full dataset (e.g., on
df[available_feats].values.astype(np.float32)) before entering the loop, then
inside the loop only call scaler.transform on feat_vals; if fit_scaler is False,
ensure scaler is already fitted and continue using scaler.transform for
feat_vals (reference symbols: df.groupby("ticker"), available_feats, feat_vals,
fit_scaler, scaler.fit_transform, scaler.transform).
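The fit-once-then-transform pattern described above can be illustrated with a stdlib-only sketch. MeanStdScaler is a hypothetical stand-in for the real scaler object; in the actual script the equivalent move is calling scaler.fit on df[available_feats] once before the groupby loop.

```python
class MeanStdScaler:
    """Minimal stand-in for a StandardScaler-style object (illustration only)."""

    def fit(self, values):
        n = len(values)
        self.mean = sum(values) / n
        self.std = (sum((v - self.mean) ** 2 for v in values) / n) ** 0.5 or 1.0
        return self

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]


def scale_per_ticker(groups, scaler, fit_scaler=True):
    # Fit ONCE on the full training data, never per ticker group,
    # so every ticker is scaled with the same statistics.
    if fit_scaler:
        scaler.fit([v for g in groups.values() for v in g])
    return {ticker: scaler.transform(vals) for ticker, vals in groups.items()}
```

Refitting inside the loop would give each ticker its own mean/std, which both leaks per-group statistics and makes tickers mutually incomparable.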
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py`:
- Around line 201-203: best_state is currently assigned a shallow reference to
model.state_dict(), so it gets mutated during later training and ends up
containing the final weights; replace assignments that save the best model
(where best_state = model.state_dict()) with a deep copy: import copy and set
best_state = copy.deepcopy(model.state_dict()) so best_state holds an
independent snapshot of parameters at the best validation loss.
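The deep-copy fix can be shown in isolation. snapshot_best is a hypothetical helper for illustration; the real change is simply wrapping model.state_dict() in copy.deepcopy at the point where the best model is saved.

```python
import copy


def snapshot_best(state_dict, val_loss, best_loss, best_state):
    """Return an independent copy of the best weights.

    A bare reference to state_dict keeps mutating as training continues,
    so the "best" snapshot would silently end up holding the final weights.
    """
    if val_loss < best_loss:
        return val_loss, copy.deepcopy(state_dict)
    return best_loss, best_state
```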
In `@AI/modules/signal/models/TCN/wrapper.py`:
- Around line 217-233: The batch-prep loop uses get_standard_training_data()
directly which bypasses the same preprocessing used by predict(); update the
loop over ticker_data_map in the TCN wrapper to call
_prepare_dataframe(df.copy()) (or the same helper used by predict()) to perform
required column validation and inf/NaN cleaning before extracting
self.feature_columns, then proceed with seq_len checks, scaling
(self.scaler.transform) and appending to tensor_list and valid_tickers; ensure
you still handle exceptions and preserve the existing behavior around seq_len,
feature_columns, scaler, tensor_list and valid_tickers so batch and single-shot
prediction use identical preprocessing.
In `@AI/modules/signal/models/transformer/transformer_train_kaggle.py`:
- Around line 71-73: The saved artifact names in transformer_train_kaggle.py
('model_name' and 'scaler_name') use unsuffixed filenames which will not match
the resolver conventions in AI/modules/signal/core/artifact_paths.py (which
expects names like multi_horizon_model_prod.keras /
multi_horizon_scaler_prod.pkl or *_test for tests); update the 'model_name' and
'scaler_name' values to include the proper suffix (e.g., append _prod) so they
match the resolver lookup scheme and ensure the stored files can be found during
download/deploy.
- Around line 149-167: The current shuffle-based train_test_split after calling
loader.create_dataset causes temporal leakage; instead, perform a chronological
split on full_df before creating sequences so the scaler is fit only on training
data and validation windows are strictly later. Concretely: split full_df by a
date cutoff into train_df and val_df (no random shuffle), call
loader.create_dataset(train_df, ...) to produce X_ts_train, X_ticker_train,
X_sector_train, y_train and fit the scaler there, then call
loader.create_dataset(val_df, ..., scaler=trained_scaler or with a flag to only
transform) to produce X_ts_val, X_tick_val, X_sec_val, y_val; remove the later
train_test_split(...) shuffle=True usage and ensure horizons / n_outputs logic
remains unchanged.
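The chronological split requested above can be sketched without any ML dependencies; the cutoff date and field names here are illustrative, not taken from the actual loader.

```python
def chronological_split(rows, cutoff_date):
    """Split time-ordered rows at a date cutoff instead of shuffling.

    Every validation window is strictly later than the training data,
    and the scaler can then be fit on the training slice only.
    """
    train = [r for r in rows if r["date"] < cutoff_date]
    val = [r for r in rows if r["date"] >= cutoff_date]
    return train, val
```

A shuffled train_test_split over windowed sequences lets validation windows overlap training windows in time, which is exactly the leakage the comment flags.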
In `@AI/pipelines/weekly_routine.py`:
- Around line 155-172: The script currently ignores run_weekly_pipeline()'s
result so the process always exits 0; capture its return or exception in the
__main__ block and propagate failure by exiting with a non-zero code: call
result = run_weekly_pipeline(...), then if result indicates failure (e.g., False
or a non-zero status) call sys.exit(1) or re-raise the caught exception;
alternatively wrap run_weekly_pipeline(...) in try/except and on exception log
the error and call sys.exit(1). Ensure you import sys if needed and reference
the existing run_weekly_pipeline call in the __main__ block.
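The exit-code propagation can be sketched like this; run_pipeline is injected as a parameter purely for illustration, whereas in the real script it would be the existing run_weekly_pipeline call.

```python
import sys


def exit_code_for(result) -> int:
    """Map the pipeline's boolean outcome to a process exit code."""
    return 0 if result is True else 1


def main(run_pipeline):
    try:
        result = run_pipeline()
    except Exception as exc:
        # A crash must also surface as a non-zero exit in GitHub Actions.
        print(f"pipeline crashed: {exc}")
        sys.exit(1)
    sys.exit(exit_code_for(result))
```

Without this, the workflow job always reports success even when every step inside the pipeline failed.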
In `@AI/scripts/deploy_to_server.py`:
- Around line 89-112: The deploy_model function currently always returns True
even if files are missing or transfer errors occur; update deploy_model to track
failures by creating a failed list/local flag: for each fname in model['files']
check os.path.exists(local_path) and append fname to failed if missing, wrap
scp.put(local_path, remote_path) in a try/except to append fname on exception,
and at the end return False if failed is non-empty (and optionally log which
filenames failed) otherwise return True; refer to deploy_model, model['files'],
local_path, remote_path, os.path.exists, scp.put and the printed messages when
implementing.
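The failure-tracking return value can be sketched as follows; put_fn stands in for scp.put so the logic is testable without a server, and the helper name deploy_files is hypothetical.

```python
import os


def deploy_files(filenames, local_dir, put_fn):
    """Upload each file, collecting failures instead of always reporting success."""
    failed = []
    for fname in filenames:
        local_path = os.path.join(local_dir, fname)
        if not os.path.exists(local_path):
            failed.append(fname)       # missing local artifact
            continue
        try:
            put_fn(local_path, fname)  # stands in for scp.put(local_path, remote_path)
        except Exception:
            failed.append(fname)       # transfer error
    return len(failed) == 0, failed
```

Returning the failed list also lets the caller log exactly which artifacts never reached the server.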
- Around line 73-74: Replace the unsafe AutoAddPolicy usage by enforcing host
key verification: instead of
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()), set
ssh.set_missing_host_key_policy(paramiko.RejectPolicy()) and load the known host
key from a secret (e.g. SSH_HOST_KEY) into the client before connecting; read
SSH_HOST_KEY from the environment/CI secret, parse it and add it with
ssh.get_host_keys().add(hostname, key_type, key_data) (or use
SSHClient.load_host_keys when writing the secret to a temp known_hosts file) so
the code around SSHClient() and set_missing_host_key_policy(...) performs strict
verification.
In `@AI/scripts/download_weights.py`:
- Around line 30-40: The download script currently can return success without
validating downloaded files, risking stale weights being reused; update the
logic in download_weights.py (referencing MODELS) to download each model into a
temporary directory first, verify presence and exact names of the expected files
(the same filenames deploy_to_server.py expects), only after validation
atomically replace/rsync into the model's dst_dir (empty/backup the dst_dir
before move), and fail/return False if any expected file is missing or names
differ; apply the same change to the code block around lines 61-80 that handles
additional models.
In `@AI/scripts/extract_to_parquet.py`:
- Around line 178-184: The SELECT in read_sql_safe currently omits two features
used downstream (correlation_spike and surprise_cpi), causing a mismatch with
AI/modules/signal/models/itransformer/train_kaggle.py; fix by adding
correlation_spike and surprise_cpi to the SELECT projection from
macroeconomic_indicators in extract_to_parquet.py (ensure read_sql_safe and
resulting DataFrame include these columns so they persist into the output
parquet).
- Around line 123-132: The retry loop leaks DB connections because conn.close()
is only called on success; update the loop around get_conn() and
pd.read_sql(query, conn) (inside the retry in extract_to_parquet.py) to always
close the connection on each attempt — either by using a try/finally that calls
conn.close() if conn is not None or by acquiring the connection with a context
manager — so every opened connection from get_conn() is closed even when
pd.read_sql throws.
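The try/finally pattern requested above can be sketched generically; get_conn and read_fn are injected stand-ins for the script's connection factory and pd.read_sql.

```python
def read_with_retry(get_conn, read_fn, query, attempts=3):
    """Retry the read, but close the connection on EVERY attempt,
    even when read_fn raises, so retries never leak connections."""
    last_err = None
    for _ in range(attempts):
        conn = None
        try:
            conn = get_conn()
            return read_fn(query, conn)
        except Exception as exc:
            last_err = exc
        finally:
            if conn is not None:
                conn.close()
    raise RuntimeError(f"query failed after {attempts} attempts") from last_err
```

The finally block runs both on success and on failure, which is what guarantees the close on every attempt.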
In `@AI/scripts/trigger_training.py`:
- Around line 34-44: NOTEBOOKS entries use project_root joined to
"kaggle_notebooks/..." but the repo layout places notebooks under
AI/kaggle_notebooks, so update the NOTEBOOKS constant (the dicts for "PatchTST"
and "Transformer") to include the missing "AI" path segment when building
notebook_dir (e.g., include "AI" as an extra path component in os.path.join for
each notebook_dir) so trigger_notebook() can locate the folders.
- Around line 94-117: In the status-polling loop (the code that calls
subprocess.run(["kaggle", "kernels", "status", slug], ...) inside the
trigger_training flow), check result.returncode immediately after subprocess.run
and treat any non-zero return code as a hard failure: log/print the error
details (result.stderr or result.stdout), return False, and stop polling instead
of sleeping and retrying; keep the existing checks for
"complete"/"error"/"running" only when returncode == 0 to avoid unnecessary
5-minute retries (follow the same returncode handling pattern used in
trigger_notebook()).
In `@AI/scripts/upload_to_kaggle.py`:
- Around line 34-39: The subprocess.run call in upload_to_kaggle.py references
an undefined variable KAGGLE_DATA_DIR causing a NameError; change that argument
to use the existing OUTPUT_DIR (or rename OUTPUT_DIR to KAGGLE_DATA_DIR
consistently) so the call in subprocess.run([... "-p", KAGGLE_DATA_DIR, ...])
uses the defined upload directory; update any other occurrences to keep the
variable name consistent and ensure the kaggle upload command runs.
---
Minor comments:
In `@AI/kaggle_notebooks/TCN/notebook.ipynb`:
- Around line 60-62: The pip install cell currently runs "!pip install -q
scikit-learn pandas numpy torch" without pinned versions; update that cell (the
pip install command string) to pin package versions to reproducible releases
(e.g., ensure numpy is constrained to <2.0 as required by the
library_context/manifest and pin compatible versions of scikit-learn, pandas,
and torch) so the notebook installs deterministic, manifest-compatible packages.
In `@AI/scripts/deploy_to_server.py`:
- Around line 76-77: The current code only calls
paramiko.RSAKey.from_private_key(io.StringIO(SSH_KEY_STR)), which fails for
Ed25519/ECDSA keys; update the logic that creates private_key to try parsing
SSH_KEY_STR with multiple Paramiko key classes (e.g.,
paramiko.RSAKey.from_private_key, paramiko.ECDSAKey.from_private_key,
paramiko.Ed25519Key.from_private_key, and paramiko.DSSKey.from_private_key)
using io.StringIO(SSH_KEY_STR), catching exceptions and falling through until
one succeeds, and raise a clear error if none parse; reference the private_key
variable and the from_private_key calls to locate where to implement this
fallback parsing.
---
Nitpick comments:
In @.github/workflows/train.yml:
- Around line 58-63: Replace the fragile echo that constructs the Kaggle JSON
with a safe JSON builder: use jq (or printf piped to jq -n --arg) to create the
JSON object from the GitHub secrets and write it to ~/.kaggle/kaggle.json, then
set chmod 600; specifically update the "Setup Kaggle credentials" step to pass
${{ secrets.KAGGLE_USERNAME }} and ${{ secrets.KAGGLE_KEY }} as jq arguments
(instead of interpolating them into a raw string) to avoid JSON corruption or
shell injection.
- Line 31: Update the GitHub Actions steps that reference actions/checkout@v3
and actions/setup-python@v4 to their current stable releases: locate the `uses:
actions/checkout@v3` and `uses: actions/setup-python@v4` lines and bump them to
the latest recommended major versions (e.g., actions/checkout@v4 and
actions/setup-python@v5 or the most recent stable tags), then commit the change
and run a workflow lint/validation to ensure compatibility with any inputs you
rely on.
In @.gitignore:
- Line 47: The global pattern "test_*.py" in .gitignore currently ignores test
files across the entire repository, risking that they never reach CI; remove
the "test_*.py" entry or narrow its scope to specific directories (e.g.,
"scripts/test_*.py" or "dev_tools/test_*.py"), verify that the changed pattern
only ignores files in the intended locations, then commit; reference target:
find and replace or delete the "test_*.py" pattern in .gitignore.
In `@AI/kaggle_notebooks/TCN/notebook.ipynb`:
- Around line 33-42: The subprocess.run call that clones the repo doesn't check
for errors; update the clone logic that uses subprocess.run and the result
variable by either adding check=True to subprocess.run or explicitly inspecting
result.returncode (and using result.stderr/result.stdout) after the call; if
returncode != 0, log/print a clear error message including result.stderr and
raise an exception (e.g., RuntimeError or re-raise CalledProcessError) so
downstream cells don't proceed on a failed git clone.
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Around line 326-327: The dict entries for "n_train_samples" and
"n_val_samples" wrap len(...) with an unnecessary int() call; remove the int()
so use "n_train_samples": len(X_train) and "n_val_samples": len(X_val) (locate
these keys in the training metadata construction in train_kaggle.py).
In `@AI/modules/signal/models/PatchTST/train_kaggle.py`:
- Around line 185-194: The except block in the try around
add_technical_indicators and add_multi_timeframe_features swallows the original
exception; update the handler in train_kaggle.py to re-raise a RuntimeError
using "raise RuntimeError(... ) from e" so the exception chain is preserved and
include the original exception message, and consider narrowing the except to
specific exceptions thrown by
add_technical_indicators/add_multi_timeframe_features (e.g.,
ValueError/TypeError) instead of a bare Exception; keep the existing
fail_count/fail_limit logic and logging but ensure the re-raise uses "from e".
- Line 321: Replace the unnecessary f-string in the final print statement:
locate the print call in train_kaggle.py that currently uses print(f"\n>> 완료")
and change it to a normal string print (remove the leading "f") so it reads
print("\n>> 완료"), leaving the rest of the function and surrounding code
unchanged.
In `@AI/modules/signal/models/PatchTST/wrapper.py`:
- Around line 191-193: Update the get_signals method to match the
BaseSignalModel contract by adding the explicit ticker_id and sector_id
parameters (e.g., def get_signals(self, df, ticker_id=None, sector_id=None,
**kwargs)) while keeping the current behavior of delegating to predict; ensure
the method accepts those parameters and forwards the dataframe to
self.predict(df) (and any necessary kwargs) so overrides/type-checking remain
consistent with other wrappers and BaseSignalModel.
In `@AI/pipelines/weekly_routine.py`:
- Line 30: Remove the unused import of the traceback module: delete the "import
traceback" line in AI/pipelines/weekly_routine.py (the unused symbol is
traceback) so there are no unused imports left; run linters to confirm no other
unused imports remain.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 583ecab4-5362-4c58-9617-171cf32089f0
📒 Files selected for processing (23)
- .github/workflows/train.yml
- .gitignore
- AI/kaggle_notebooks/TCN/kernel-metadata.json
- AI/kaggle_notebooks/TCN/notebook.ipynb
- AI/kaggle_notebooks/itransformer/kernel-metadata.json
- AI/kaggle_notebooks/itransformer/notebook.ipynb
- AI/kaggle_notebooks/patchtst/kernel-metadata.json
- AI/kaggle_notebooks/patchtst/notebook.ipynb
- AI/kaggle_notebooks/transformer/kernel-metadata.json
- AI/kaggle_notebooks/transformer/notebook.ipynb
- AI/modules/signal/core/artifact_paths.py
- AI/modules/signal/models/PatchTST/train_kaggle.py
- AI/modules/signal/models/PatchTST/wrapper.py
- AI/modules/signal/models/TCN/tcn_train_kaggle.py
- AI/modules/signal/models/TCN/wrapper.py
- AI/modules/signal/models/itransformer/train_kaggle.py
- AI/modules/signal/models/transformer/transformer_train_kaggle.py
- AI/pipelines/weekly_routine.py
- AI/scripts/deploy_to_server.py
- AI/scripts/download_weights.py
- AI/scripts/extract_to_parquet.py
- AI/scripts/trigger_training.py
- AI/scripts/upload_to_kaggle.py
shutil.copy('/kaggle/working/transformer_architecture.py', '/kaggle/working/architecture.py')

from technical_features import add_technical_indicators, add_multi_timeframe_features
print('>> import 완료')
As written, this cell will stop with a NameError/ModuleNotFoundError.
shutil is never imported, and technical_features on Line 38 is not a top-level module relative to the repo path added in the first cell. Reuse the already-imported AI.modules.features.legacy.technical_features, or import it here via the exact module path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/kaggle_notebooks/transformer/notebook.ipynb` around lines 36 - 39, to fix
the NameError/ModuleNotFoundError raised by this cell, first import shutil and
bring in the technical-indicator module via the correct path: instead of the
current from technical_features import add_technical_indicators,
add_multi_timeframe_features, import add_technical_indicators and
add_multi_timeframe_features from the already-used
AI.modules.features.legacy.technical_features (or reuse that existing import)
so the module path is referenced correctly, and confirm shutil is imported
before the shutil.copy call.
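A runnable sketch of the two fixes, with a temporary directory standing in for /kaggle/working so it executes outside Kaggle (the real cell copies the architecture file and imports the repo package, as noted above):

```python
# Sketch with stand-in paths; the real notebook uses /kaggle/working/...
import os
import shutil          # fix 1: shutil must be imported before shutil.copy
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "transformer_architecture.py")
with open(src, "w") as f:
    f.write("# architecture definition\n")

shutil.copy(src, os.path.join(workdir, "architecture.py"))

# Fix 2: import the helpers from their real package path, e.g.
#   from AI.modules.features.legacy.technical_features import (
#       add_technical_indicators, add_multi_timeframe_features)
# instead of the nonexistent top-level `technical_features` module.
print(sorted(os.listdir(workdir)))
```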
df = read_sql_safe("""
    SELECT date, cpi, gdp, interest_rate, unemployment_rate,
           us10y, us2y, yield_spread, vix_close, dxy_close,
           wti_price, gold_price, credit_spread_hy
    FROM macroeconomic_indicators
    ORDER BY date
""", "macroeconomic_indicators")
The macro-indicator extraction columns do not match the downstream training script.
AI/modules/signal/models/itransformer/train_kaggle.py:115-120 also treats correlation_spike and surprise_cpi as feature candidates. Dropping both here means Kaggle training silently proceeds without them, so the feature set diverges from the existing pipeline.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/extract_to_parquet.py` around lines 178 - 184, The SELECT in
read_sql_safe currently omits two features used downstream (correlation_spike
and surprise_cpi), causing a mismatch with
AI/modules/signal/models/itransformer/train_kaggle.py; fix by adding
correlation_spike and surprise_cpi to the SELECT projection from
macroeconomic_indicators in extract_to_parquet.py (ensure read_sql_safe and
resulting DataFrame include these columns so they persist into the output
parquet).
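A hedged sketch of the widened projection, plus a cheap guard so the same drift cannot silently recur (the query text and column names follow the review; verify them against the actual schema):

```python
# The corrected SELECT as a plain string, with the two downstream columns added.
MACRO_QUERY = """
    SELECT date, cpi, gdp, interest_rate, unemployment_rate,
           us10y, us2y, yield_spread, vix_close, dxy_close,
           wti_price, gold_price, credit_spread_hy,
           correlation_spike, surprise_cpi
    FROM macroeconomic_indicators
    ORDER BY date
"""

# Columns the itransformer training script treats as feature candidates; a
# simple containment check fails fast if the projection regresses.
EXPECTED = ["correlation_spike", "surprise_cpi"]
missing = [col for col in EXPECTED if col not in MACRO_QUERY]
assert not missing, f"macro query is missing: {missing}"
```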
Actionable comments posted: 2
🧹 Nitpick comments (1)
.github/workflows/train.yml (1)
35-44: Pin dependency versions and enable caching.
Because dependency versions are unpinned, build reproducibility can suffer; using the setup-python action's caching would also speed up builds.
♻️ Suggested fix
Create a requirements-train.txt to pin versions, or at minimum specify major versions:
  - name: Setup Python
    uses: actions/setup-python@v5
    with:
      python-version: '3.10'
+     cache: 'pip'

  - name: Install dependencies
    run: |
      pip install \
-       kaggle \
-       pandas \
-       pyarrow \
-       psycopg2-binary \
-       paramiko \
-       sshtunnel \
-       scp
+       kaggle>=1.6.0 \
+       pandas>=2.0.0 \
+       pyarrow>=14.0.0 \
+       psycopg2-binary>=2.9.0 \
+       paramiko>=3.0.0 \
+       sshtunnel>=0.4.0 \
+       scp>=0.14.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/train.yml around lines 35 - 44, The Install dependencies step currently runs pip install with unpinned packages; create a requirements-train.txt listing the exact package versions (or minimum major versions) for kaggle, pandas, pyarrow, psycopg2-binary, paramiko, sshtunnel, scp and update the "Install dependencies" step to pip install -r requirements-train.txt; also enable caching by using the setup-python/cache or actions/cache keyed on the Python version and requirements-train.txt (refer to the existing setup-python action and the "Install dependencies" step) to speed builds and improve reproducibility.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/train.yml:
- Around line 27-33: Update the GitHub Actions steps that reference
actions/checkout@v3 and actions/setup-python@v4 to their current supported
releases (e.g., actions/checkout@v4 and actions/setup-python@v5 or the latest
stable major versions); locate the steps that use the strings
"actions/checkout@v3" and "actions/setup-python@v4" in the workflow and replace
them with the updated action versions, then run a workflow lint or dry-run to
verify compatibility.
- Around line 46-51: Replace the fragile echo JSON creation in the "Setup Kaggle
credentials" step with a jq-based safe generator: capture the secrets
($KAGGLE_USERNAME and $KAGGLE_KEY) into environment variables and run something
like jq -n --arg u "$KAGGLE_USERNAME" --arg k "$KAGGLE_KEY"
'{username:$u,key:$k}' > ~/.kaggle/kaggle.json, then chmod 600
~/.kaggle/kaggle.json; this ensures special characters in
KAGGLE_KEY/KAGGLE_USERNAME are properly escaped and produces valid JSON.
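jq is the workflow-native fix suggested above; the same guarantee can be sketched in Python, where json.dump escapes quotes and backslashes automatically (the values below are stand-ins for the real secrets):

```python
# Generate a valid kaggle.json even when credentials contain special characters.
import json
import os
import tempfile

username = 'user"with"quotes'      # stand-in for $KAGGLE_USERNAME
key = 'k\\ey with "specials"'      # stand-in for $KAGGLE_KEY

cred_path = os.path.join(tempfile.mkdtemp(), "kaggle.json")
with open(cred_path, "w", encoding="utf-8") as f:
    json.dump({"username": username, "key": key}, f)  # escapes quotes/backslashes
os.chmod(cred_path, 0o600)

# Round-trip proves the file stays valid JSON despite the special characters.
with open(cred_path, encoding="utf-8") as f:
    creds = json.load(f)
```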
---
Nitpick comments:
In @.github/workflows/train.yml:
- Around line 35-44: The Install dependencies step currently runs pip install
with unpinned packages; create a requirements-train.txt listing the exact
package versions (or minimum major versions) for kaggle, pandas, pyarrow,
psycopg2-binary, paramiko, sshtunnel, scp and update the "Install dependencies"
step to pip install -r requirements-train.txt; also enable caching by using the
setup-python/cache or actions/cache keyed on the Python version and
requirements-train.txt (refer to the existing setup-python action and the
"Install dependencies" step) to speed builds and improve reproducibility.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6b1f2880-10f3-48f4-85bb-c96eb2f1744e
📒 Files selected for processing (1)
.github/workflows/train.yml
Actionable comments posted: 7
♻️ Duplicate comments (1)
AI/scripts/trigger_training.py (1)
92-98: ⚠️ Potential issue | 🟠 Major — Fail immediately on status-query errors.
Even when `kaggle kernels status` returns non-zero, the code currently only inspects stdout and retries every 5 minutes. For an auth/slug/network problem that means up to 12 hours of needless waiting, so it is safer to check returncode first and return False right away.
Example fix
  while check_count < max_checks:
      result = subprocess.run(
          ["kaggle", "kernels", "status", slug],
          capture_output=True,
          text=True
      )
+     if result.returncode != 0:
+         print(f"  [{notebook['name']}] 상태 조회 실패! ❌")
+         print((result.stderr or result.stdout).strip())
+         return False
      output = result.stdout.lower()
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/trigger_training.py` around lines 92 - 98, The code currently inspects result.stdout regardless of process success; change the logic after the subprocess.run(...) call to first check result.returncode and immediately return False (and log an error including returncode/stderr) if it's non-zero, instead of continuing into the 5-minute retry loop; update the logic around the subprocess.run(...) invocation and the subsequent use of result/output to bail out on failed `kaggle kernels status` calls.
🧹 Nitpick comments (1)
AI/scripts/trigger_training.py (1)
65-73: Keep a single source of truth for the kernel ID.
`kernels push -p` resolves its target from each notebook_dir's kernel-metadata.json, but the success URL and status polling are reassembled from KAGGLE_USERNAME. If the owner changes, or metadata and env drift apart, the push succeeds while polling watches a different kernel, so it is safer to read the `id` from the metadata and reuse it verbatim.
Also applies to: 81-88
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/trigger_training.py` around lines 65 - 73, The current success URL and later polling reconstruct the kernel identifier from KAGGLE_USERNAME and notebook['slug'], which can diverge from the pushed kernel; instead, after running subprocess.run(...) to push (the kernels push -p using notebook['notebook_dir']), open and parse the kernel-metadata.json in notebook['notebook_dir'] to read the canonical kernel "id" and use that id for both the success message in the block that prints notebook['name'] and for any subsequent polling logic (the later block around lines 81-88), replacing usage of KAGGLE_USERNAME/notebook['slug'] with the metadata id to ensure a single authoritative source for the kernel identifier.
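The single-source idea above can be sketched as follows; the metadata content and directory are hypothetical, and only the lookup pattern matters:

```python
import json
import os
import tempfile

def resolve_kernel_id(notebook_dir: str) -> str:
    """Read the canonical kernel id (owner/slug) from kernel-metadata.json so
    the push, the success URL, and status polling all reference one kernel."""
    path = os.path.join(notebook_dir, "kernel-metadata.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)["id"]

# Self-contained demo against a hypothetical metadata file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "kernel-metadata.json"), "w", encoding="utf-8") as f:
    json.dump({"id": "some-user/tcn-training"}, f)

kernel_id = resolve_kernel_id(demo_dir)
# e.g. subprocess.run(["kaggle", "kernels", "status", kernel_id], ...)
```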
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Around line 130-132: The global df.ffill() after merging price_df and macro_df
causes feature leakage across tickers; change the fill strategy to forward-fill
macro columns only within each ticker group and then fill remaining NaNs with 0.
Locate the merged dataframe operations around df = pd.merge(price_df, macro_df,
on="date", how="left"), df = df.sort_values(["ticker",
"date"]).reset_index(drop=True) and replace the df.ffill().fillna(0) step with a
grouped forward-fill for macro columns (e.g., using
df.groupby("ticker")[macro_cols].ffill()) followed by df.fillna(0) to handle
other columns.
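The grouped forward-fill described above can be sketched on a toy frame (values are illustrative; the real code applies this to the merged price/macro dataframe):

```python
import pandas as pd

# Two tickers already sorted by (ticker, date), with a gappy macro column.
df = pd.DataFrame({
    "ticker": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2024-01-01", "2024-01-02"] * 2),
    "close": [10.0, 11.0, 20.0, 21.0],
    "cpi": [3.1, None, None, 2.9],
})

macro_cols = ["cpi"]
# Forward-fill macro columns within each ticker only, then zero-fill the rest,
# so ticker B's first row cannot inherit ticker A's last cpi value.
df[macro_cols] = df.groupby("ticker")[macro_cols].ffill()
df = df.fillna(0)
```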
- Around line 49-51: The save filenames in train_kaggle.py ("model_name",
"scaler_name", "metadata_name") use the new itransformer_* names but downstream
code (artifact_paths.py, itransformer/wrapper.py, deploy_to_server.py) still
expects the old multi_horizon_* names (e.g., multi_horizon_model.keras,
multi_horizon_scaler.pkl); pick one convention and make it consistent: either
change the keys in train_kaggle.py back to
"model_name":"multi_horizon_model.keras" and
"scaler_name":"multi_horizon_scaler.pkl" (and "metadata_name" if applicable) or
update artifact_paths.py, itransformer/wrapper.py (around the loader at
wrapper.py:339) and deploy_to_server.py to look for itransformer_model.keras /
itransformer_scaler.pkl / itransformer_metadata.json; ensure all references to
model_name/scaler_name/metadata_name across these modules match exactly.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py`:
- Line 260: Remove the unnecessary f-string used in the print statement by
replacing the literal print(f"\n>> 완료") with a plain string print("\n>> 완료");
locate the exact statement print(f"\n>> 완료") in tcn_train_kaggle.py (inside the
TCN training/finishing block) and remove the leading f to satisfy Ruff F541
linting.
- Around line 269-270: The current parser.add_argument("--end-date",
default="2023-12-31") is hardcoded, so automatic retraining cannot reflect the
latest data; change the default of parser.add_argument("--end-date", ...) from
the hardcoded date to a dynamic value, using today's date
(datetime.date.today().isoformat()) or the max date read from the parquet
files (max_date), and implement the argparse handling so this dynamic value is
used only when --end-date is not supplied (e.g., after compute
max_date_from_parquet() set default_end = max(today, max_date), or after
argparse: if args.end_date is None: args.end_date = default_end).
- Around line 119-141: The loop that calls add_technical_indicators silently
skips up to 19 failures and then later pd.concat(processed) can raise "No
objects to concatenate" without context; change the feature-calculation loop
(the for ticker in tqdm(...) block) to log each failing ticker and its exception
(include the ticker value and exception text) when add_technical_indicators
raises, keep fail_count but fail faster or at least include a list of
failed_tickers, and after the loop check if processed is empty (or number of
successful tickers is zero) and raise a clear RuntimeError that includes
start_date/end_date, total tickers attempted, fail_count, and the failed_tickers
list before attempting to create full_df; ensure references to processed,
fail_count, failed_tickers, add_technical_indicators, and full_df are updated
accordingly.
- Line 268: The default parquet path set in the
parser.add_argument("--parquet-dir", default=os.environ.get('PARQUET_DIR',
'/kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset')) is invalid for
Kaggle; update the logic so if PARQUET_DIR is not set the code either: 1) uses a
safe default root '/kaggle/input' (so users must provide the specific dataset
slug), or 2) dynamically detects the mounted dataset by scanning /kaggle/input
(e.g., os.walk('/kaggle/input') and picking the matching dataset folder) and
sets the --parquet-dir fallback accordingly; change the code around the
--parquet-dir argument handling to prefer os.environ['PARQUET_DIR'] if present,
otherwise perform the detection or fallback to '/kaggle/input'.
- Around line 155-160: The current split uses "split_date =
dates[split_date_idx]" and then train_df = raw_df[raw_df['date'] <= split_date]
which includes the split_date in the training set and can leave val_df empty;
change the split to make train use dates strictly before the split_date and val
use >= split_date (i.e., use raw_df[raw_df['date'] < split_date] for train_df
and raw_df[raw_df['date'] >= split_date] for val_df) so that the chosen
split_date becomes the first date in the validation set; update references to
dates, split_date_idx, split_date, train_df, and val_df accordingly.
---
Duplicate comments:
In `@AI/scripts/trigger_training.py`:
- Around line 92-98: The code currently inspects result.stdout regardless of
process success; change the logic after the subprocess.run(...) call to first
check result.returncode and immediately return False (and log an error including
returncode/stderr) if it's non-zero, instead of continuing into the 5-minute
retry loop; update the logic around the subprocess.run(...) invocation and the
subsequent use of result/output to bail out on failed `kaggle kernels status`
calls.
---
Nitpick comments:
In `@AI/scripts/trigger_training.py`:
- Around line 65-73: The current success URL and later polling reconstruct the
kernel identifier from KAGGLE_USERNAME and notebook['slug'], which can diverge
from the pushed kernel; instead, after running subprocess.run(...) to push (the
kernels push -p using notebook['notebook_dir']), open and parse the
kernel-metadata.json in notebook['notebook_dir'] to read the canonical kernel
"id" and use that id for both the success message in the block that prints
notebook['name'] and for any subsequent polling logic (the later block around
lines 81-88), replacing usage of KAGGLE_USERNAME/notebook['slug'] with the
metadata id to ensure a single authoritative source for the kernel identifier.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 88b44e3d-2441-4d9c-be85-f620b46bfdfa
📒 Files selected for processing (5)
AI/modules/signal/core/artifact_paths.py
AI/modules/signal/models/TCN/tcn_train_kaggle.py
AI/modules/signal/models/itransformer/train_kaggle.py
AI/scripts/trigger_training.py
AI/scripts/upload_to_kaggle.py
🚧 Files skipped from review as they are similar to previous changes (1)
- AI/modules/signal/core/artifact_paths.py
raw_df = raw_df[
    (raw_df['date'] >= start_date) &
    (raw_df['date'] <= end_date)
].copy()

print(f">> 로드 완료: {len(raw_df):,}행, {raw_df['ticker'].nunique()}개 종목")

# 피처 계산
print(">> 피처 계산 중...")
processed = []
fail_count = 0

for ticker in tqdm(raw_df['ticker'].unique(), desc="피처 계산"):
    df_t = raw_df[raw_df['ticker'] == ticker].copy()
    try:
        df_t = add_technical_indicators(df_t)
        processed.append(df_t)
    except Exception as e:
        fail_count += 1
        if fail_count >= 20:
            raise RuntimeError("피처 계산 실패가 20개를 초과했습니다.")

full_df = pd.concat(processed).reset_index(drop=True)
Silently dropping preprocessing failures ends in partial training or an opaque error.
The current implementation simply skips up to 19 add_technical_indicators() failures, so training can proceed on only a subset of tickers without distinguishing bugs from data issues. Worse, if the date filter yields nothing or every ticker fails, Line 141 leaves only "No objects to concatenate" and the root cause is lost.
🛠️ Suggested fix
raw_df = raw_df[
- (raw_df['date'] >= start_date) &
- (raw_df['date'] <= end_date)
+ (raw_df["date"] >= start_date) &
+ (raw_df["date"] <= end_date)
].copy()
+ if raw_df.empty:
+ raise ValueError("지정한 날짜 범위에 해당하는 데이터가 없습니다.")
@@
- except Exception as e:
+ except Exception as err:
fail_count += 1
+ print(f">> 피처 계산 실패: {ticker} ({fail_count}개) - {err}")
if fail_count >= 20:
- raise RuntimeError("피처 계산 실패가 20개를 초과했습니다.")
+ raise RuntimeError("피처 계산 실패가 20개를 초과했습니다.") from err
- full_df = pd.concat(processed).reset_index(drop=True)
+ if not processed:
+ raise ValueError("피처 계산에 성공한 종목이 없습니다.")
+ full_df = pd.concat(processed, ignore_index=True)
🧰 Tools
🪛 Ruff (0.15.7)
[warning] 136-136: Do not catch blind exception: Exception
(BLE001)
[warning] 139-139: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` around lines 119 - 141, The
loop that calls add_technical_indicators silently skips up to 19 failures and
then later pd.concat(processed) can raise "No objects to concatenate" without
context; change the feature-calculation loop (the for ticker in tqdm(...) block)
to log each failing ticker and its exception (include the ticker value and
exception text) when add_technical_indicators raises, keep fail_count but fail
faster or at least include a list of failed_tickers, and after the loop check if
processed is empty (or number of successful tickers is zero) and raise a clear
RuntimeError that includes start_date/end_date, total tickers attempted,
fail_count, and the failed_tickers list before attempting to create full_df;
ensure references to processed, fail_count, failed_tickers,
add_technical_indicators, and full_df are updated accordingly.
dates = raw_df['date'].sort_values().unique()
split_date_idx = int(len(dates) * 0.8)
split_date = dates[split_date_idx]

train_df = raw_df[raw_df['date'] <= split_date].copy()
val_df = raw_df[raw_df['date'] > split_date].copy()
The train/val split is off by one day and can leave the validation set empty.
Splitting with <= / > after split_date = dates[int(len(dates) * 0.8)] puts the split date on the train side. With 5 unique dates, for example, int(5 * 0.8) == 4, so the val set at Line 160 is immediately an empty DataFrame.
🛠️ Suggested fix
dates = raw_df['date'].sort_values().unique()
split_date_idx = int(len(dates) * 0.8)
- split_date = dates[split_date_idx]
+ if split_date_idx <= 0 or split_date_idx >= len(dates):
+ raise ValueError("Train/Val 분할을 위해 최소 2개 이상의 거래일이 필요합니다.")
+ train_dates = dates[:split_date_idx]
+ val_dates = dates[split_date_idx:]
- train_df = raw_df[raw_df['date'] <= split_date].copy()
- val_df = raw_df[raw_df['date'] > split_date].copy()
- print(f">> Train: ~{split_date}, Val: {split_date}~")
+ train_df = raw_df[raw_df['date'].isin(train_dates)].copy()
+ val_df = raw_df[raw_df['date'].isin(val_dates)].copy()
+ print(f">> Train: {train_dates[0]}~{train_dates[-1]}, Val: {val_dates[0]}~{val_dates[-1]}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` around lines 155 - 160, The
current split uses "split_date = dates[split_date_idx]" and then train_df =
raw_df[raw_df['date'] <= split_date] which includes the split_date in the
training set and can leave val_df empty; change the split to make train use
dates strictly before the split_date and val use >= split_date (i.e., use
raw_df[raw_df['date'] < split_date] for train_df and raw_df[raw_df['date'] >=
split_date] for val_df) so that the chosen split_date becomes the first date in
the validation set; update references to dates, split_date_idx, split_date,
train_df, and val_df accordingly.
with open(metadata_path, "w", encoding="utf-8") as f:
    json.dump(metadata, f, ensure_ascii=False, indent=2)

print(f"\n>> 완료")
The unnecessary f prefix triggers Ruff F541.
The string has no placeholders, so the f serves no purpose. With a lint gate in place, CI will block here.
🛠️ Suggested fix
- print(f"\n>> 완료")
+ print("\n>> 완료")
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- print(f"\n>> 완료")
+ print("\n>> 완료")
🧰 Tools
🪛 Ruff (0.15.7)
[error] 260-260: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` at line 260, Remove the
unnecessary f-string used in the print statement by replacing the literal
print(f"\n>> 완료") with a plain string print("\n>> 완료"); locate the exact
statement print(f"\n>> 완료") in tcn_train_kaggle.py (inside the TCN
training/finishing block) and remove the leading f to satisfy Ruff F541 linting.
def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Train TCN signal model (Kaggle/Actions 버전)")
    parser.add_argument("--parquet-dir", default=os.environ.get('PARQUET_DIR', '/kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset'))
🧩 Analysis chain
🌐 Web query:
In a Kaggle Notebook, what is the default mount path when a dataset is attached? Is /kaggle/input/datasets/<owner>/<slug> also a valid runtime path, or should the default be /kaggle/input/<slug>?
💡 Result:
When a dataset is attached in a Kaggle Notebook, the default mount path is /kaggle/input/. Each attached dataset is mounted under /kaggle/input/ in a directory named for the dataset's slug (or its name, in owner/slug form). For example, the "aviation-accident-database-synopses" dataset appears at /kaggle/input/aviation-accident-database-synopses/AviationData.csv. The /kaggle/input/datasets/<owner>/<slug> form is non-standard and invalid. The default is /kaggle/input/<dataset-slug>, and the safest way to confirm the actual path is os.walk('/kaggle/input') or the notebook's Data tab. Relying on /kaggle/input/ alone is not recommended because of possible name collisions.
Citations:
- 1: https://www.kaggle.com/docs/notebooks
- 2: https://www.kaggle.com/code/dansbecker/finding-your-files-in-kaggle-kernels
- 3: https://www.kaggle.com/discussions/getting-started/25930
- 4: https://www.kaggle.com/discussions/product-feedback/31738
The default Kaggle input path needs fixing
The current default /kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset is not Kaggle's standard mount path. In a Kaggle Notebook, attached datasets are mounted only as /kaggle/input/<dataset-slug> or /kaggle/input/<owner-dataset-slug>. The /kaggle/input/datasets/ path is invalid, so the script fails whenever the PARQUET_DIR environment variable is unset.
Confirm the dataset's exact slug and correct the default accordingly, or switch to detecting the mounted path at runtime with os.walk('/kaggle/input').
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` at line 268, The default
parquet path set in the parser.add_argument("--parquet-dir",
default=os.environ.get('PARQUET_DIR',
'/kaggle/input/datasets/jihyeongkimm/sisc-ai-trading-dataset')) is invalid for
Kaggle; update the logic so if PARQUET_DIR is not set the code either: 1) uses a
safe default root '/kaggle/input' (so users must provide the specific dataset
slug), or 2) dynamically detects the mounted dataset by scanning /kaggle/input
(e.g., os.walk('/kaggle/input') and picking the matching dataset folder) and
sets the --parquet-dir fallback accordingly; change the code around the
--parquet-dir argument handling to prefer os.environ['PARQUET_DIR'] if present,
otherwise perform the detection or fallback to '/kaggle/input'.
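The runtime-detection fallback can be sketched as below; find_dataset_dir is a hypothetical helper, demonstrated against a fake mount layout rather than the real /kaggle/input:

```python
import os
import tempfile

def find_dataset_dir(input_root: str, slug_hint: str) -> str:
    """Scan the Kaggle input root for the mounted dataset directory whose
    name contains the slug hint; fall back to the root so callers fail loudly."""
    for entry in sorted(os.listdir(input_root)):
        full = os.path.join(input_root, entry)
        if os.path.isdir(full) and slug_hint in entry:
            return full
    return input_root

# Demo: a fake mount layout standing in for /kaggle/input.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sisc-ai-trading-dataset"))
parquet_dir = find_dataset_dir(root, "sisc-ai-trading")
```

In the real script, os.environ.get('PARQUET_DIR') would still take precedence, with this detection used only as the fallback.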
parser.add_argument("--start-date", default="2015-01-01")
parser.add_argument("--end-date", default="2023-12-31")
The default end-date is pinned to 2023-12-31, so automatic retraining never sees the latest data.
With today being 2026-03-27, the current default keeps excluding everything after 2024-01-01 from every run without an override. For a weekly pipeline, it is safer to default to the run date or the parquet's max date.
🛠️ Suggested fix
parser.add_argument("--start-date", default="2015-01-01")
- parser.add_argument("--end-date", default="2023-12-31")
+ parser.add_argument(
+ "--end-date",
+ default=os.environ.get("END_DATE", pd.Timestamp.today().strftime("%Y-%m-%d")),
+ )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/tcn_train_kaggle.py` around lines 269 - 270,
The current parser.add_argument("--end-date", default="2023-12-31") is
hardcoded, so automatic retraining cannot reflect the latest data; change the
default of parser.add_argument("--end-date", ...) from the hardcoded date to a
dynamic value, using today's date (datetime.date.today().isoformat()) or the
max date read from the parquet files (max_date), and implement the argparse
handling so this dynamic value is used only when --end-date is not supplied
(e.g., after compute max_date_from_parquet() set default_end = max(today,
max_date), or after argparse: if args.end_date is None: args.end_date =
default_end).
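The None-sentinel pattern described above can be sketched as follows (the parquet-derived max_date alternative is noted but not computed here):

```python
import argparse
import datetime

# The dynamic default applies only when --end-date is not supplied.
parser = argparse.ArgumentParser()
parser.add_argument("--start-date", default="2015-01-01")
parser.add_argument("--end-date", default=None)

args = parser.parse_args([])        # simulate a run with no override
if args.end_date is None:
    # Real code could instead use max(today, max_date_from_parquet()).
    args.end_date = datetime.date.today().isoformat()

print(args.end_date)
```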
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/modules/signal/models/PatchTST/train.py`:
- Around line 217-219: The current split forces n_val = max(1, int(len(tickers)
* 0.2)) which makes train_tickers empty when len(tickers) == 1 and leads to
divide-by-zero later; change the logic around tickers, n_val and val_tickers so
n_val is at most len(tickers)-1 (e.g. n_val = min(max(1, int(len(tickers) *
0.2)), max(0, len(tickers) - 1))), and if len(tickers) < 2 either raise a clear
error or handle the single-ticker case explicitly (ensure train_tickers is
non-empty before proceeding and adjust downstream averaging logic in the
training loop that uses train_tickers/val_tickers).
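The clamped split can be sketched with a hypothetical helper (the 0.2 ratio follows the review text; split_tickers is not the repo's actual function name):

```python
# Clamp n_val so the training set is never emptied, and fail fast on a
# single-ticker input instead of dividing by zero downstream.
def split_tickers(tickers):
    if len(tickers) < 2:
        raise ValueError("Need at least 2 tickers for a train/val split.")
    n_val = min(max(1, int(len(tickers) * 0.2)), len(tickers) - 1)
    return tickers[:-n_val], tickers[-n_val:]

train, val = split_tickers(["AAPL", "MSFT", "NVDA"])
```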
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3914cdca-d1bf-467b-adcb-47a133d02b6a
📒 Files selected for processing (1)
AI/modules/signal/models/PatchTST/train.py
Commit messages: "Change comments", "Uncomment merged models", "Improve upload to kaggle imports"
Actionable comments posted: 14
♻️ Duplicate comments (7)
AI/modules/signal/models/TCN/train_kaggle.py (1)
287-288: ⚠️ Potential issue | 🟠 Major — The fixed default --end-date halts retraining on the latest data.
As of today (2026-03-29), the hardcoded 2023-12-31 default conflicts with the automated pipeline's goal of reflecting the newest data.
Example fix
- parser.add_argument("--end-date", default="2023-12-31")
+ parser.add_argument(
+     "--end-date",
+     default=os.environ.get("END_DATE", pd.Timestamp.today().strftime("%Y-%m-%d")),
+ )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/TCN/train_kaggle.py` around lines 287 - 288, The CLI currently hardcodes parser.add_argument("--end-date", default="2023-12-31") which prevents automatic retrain on newest data; change the default to a dynamic value (e.g., compute today's date at runtime or use None meaning "up to today") and update any downstream code that reads the --end-date argument (the parser usage sites) to handle the new sentinel by converting None to datetime.date.today() or formatting today's date string; ensure you import datetime where needed and update any tests or invocations that expect the old fixed date.
AI/modules/signal/models/transformer/train_kaggle.py (1)
82-83: ⚠️ Potential issue | 🟠 Major — The artifact filenames likely mismatch the deployment resolver convention.
The current names (multi_horizon_model.keras, multi_horizon_scaler.pkl) may break the _prod convention, risking lookup failures during the download/deploy stages.
Example fix
- 'model_name' : 'multi_horizon_model.keras',
- 'scaler_name' : 'multi_horizon_scaler.pkl',
+ 'model_name' : 'multi_horizon_model_prod.keras',
+ 'scaler_name' : 'multi_horizon_scaler_prod.pkl',
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/transformer/train_kaggle.py` around lines 82 - 83, The artifact filenames in the config dict (the 'model_name' and 'scaler_name' entries in train_kaggle.py) don't follow the deployment resolver naming convention and may fail during download; update those entries to match the resolver pattern (e.g., append the required suffix such as "_prod" before the file extension: change "multi_horizon_model.keras" → "multi_horizon_model_prod.keras" and "multi_horizon_scaler.pkl" → "multi_horizon_scaler_prod.pkl"), or alternatively generate names via the project's artifact-resolver helper (use the existing helper/constant that produces resolver-compliant names) so both 'model_name' and 'scaler_name' always follow the deployment naming convention.
AI/scripts/deploy_to_server.py (2)
103-118: ⚠️ Potential issue | 🟠 Major — deploy_model treats partial failure as success.
On lines 103-118, missing files are skipped yet the function still returns True at the end. Collect transfer failures and missing files, and return False when any exist, so the pipeline fails correctly.
Example fix
  def deploy_model(ssh: paramiko.SSHClient, scp: SCPClient, model: dict) -> bool:
@@
+     failed = []
      for fname in model['files']:
          local_path = os.path.join(model['local_dir'], fname)
@@
          if not os.path.exists(local_path):
              print(f"  [경고] 파일 없음 (스킵): {local_path}")
+             failed.append(fname)
              continue
@@
-         scp.put(local_path, remote_path)
+         try:
+             scp.put(local_path, remote_path)
+         except Exception as e:
+             print(f"  [오류] 전송 실패: {fname} ({e})")
+             failed.append(fname)
+             continue
          print(f"  전송 완료: {remote_path}")
-     print(f"  [{model['name']}] 배포 완료! ✅")
-     return True
+     if failed:
+         print(f"  [{model['name']}] 배포 실패: {failed} ❌")
+         return False
+     print(f"  [{model['name']}] 배포 완료! ✅")
+     return True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/deploy_to_server.py` around lines 103 - 118, The current deploy_model loop always returns True even if files were missing or transfer failed; modify deploy_model to track failures by introducing a boolean or list (e.g., had_failures or failed_files) and mark it when a local file is missing or when scp.put raises/returns an error (wrap scp.put in try/except and record failures on exceptions or non-success results), and at the end return False if any failures were recorded, otherwise return True; reference the for loop iterating model['files'], the local_path/remote_path handling, and the scp.put call to locate where to add the failure collection and the final conditional return.
78-78: ⚠️ Potential issue | 🟠 Major — Host-key verification is disabled.
AutoAddPolicy() on Line 78 is a security risk in a production deployment pipeline. Enforce verification with RejectPolicy() plus pre-registered known_hosts (or secret injection).
Example fix
  def create_ssh_client() -> paramiko.SSHClient:
      """SSH 연결 생성 (키 문자열로 직접 연결)"""
      ssh = paramiko.SSHClient()
-     ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
+     ssh.load_system_host_keys()
+     ssh.set_missing_host_key_policy(paramiko.RejectPolicy())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/deploy_to_server.py` at line 78, Replace the insecure AutoAddPolicy usage by enforcing host-key verification: in the SSH client setup (SSHClient instance where ssh.set_missing_host_key_policy is called) remove paramiko.AutoAddPolicy() and instead call ssh.load_system_host_keys() or ssh.load_host_keys(<path_to_injected_known_hosts>) and set ssh.set_missing_host_key_policy(paramiko.RejectPolicy()); also add clear error handling in the connection flow (e.g., catch paramiko.SSHException/BadHostKeyException) to fail the deploy when the host key is not recognized so the pipeline uses pre-registered known_hosts or injected secret.
AI/pipelines/weekly_routine.py (1)
169-172: ⚠️ Potential issue | 🔴 Critical — The main entrypoint does not propagate failure through the exit code.
Lines 169-172 ignore the return value of run_weekly_pipeline(), so the process can exit with success (0) even when the pipeline fails.
Example fix
- run_weekly_pipeline(
+ success = run_weekly_pipeline(
      skip_extract = args.skip_extract,
      skip_upload  = args.skip_upload,
  )
+ sys.exit(0 if success else 1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/pipelines/weekly_routine.py` around lines 169 - 172, The call to run_weekly_pipeline is ignoring its return value so failures won't set a non‑zero process exit status; change the main invocation to capture the function's return (e.g., ret = run_weekly_pipeline(...)) and propagate it via sys.exit(ret) (or sys.exit(1) when the function indicates failure) or re-raise/raise SystemExit based on run_weekly_pipeline's contract; update the invocation site around run_weekly_pipeline to exit with the proper code instead of discarding the result.
AI/scripts/extract_to_parquet.py (1)
195-202: ⚠️ Potential issue | 🟠 Major
The macro-indicator extraction columns do not match the iTransformer training features.
The SELECT at lines 195-202 omits correlation_spike and surprise_cpi, so features expected downstream (AI/modules/signal/models/itransformer/train_kaggle.py) are missing.
Fix example
     df = read_sql_safe("""
         SELECT date, cpi, gdp, interest_rate, unemployment_rate,
                us10y, us2y, yield_spread, vix_close, dxy_close,
                wti_price, gold_price, credit_spread_hy,
-               core_cpi, pce, core_pce
+               core_cpi, pce, core_pce,
+               correlation_spike, surprise_cpi
         FROM macroeconomic_indicators
         ORDER BY date
     """, "macroeconomic_indicators")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/extract_to_parquet.py` around lines 195 - 202, The SELECT used to build df in extract_to_parquet.py is missing the features expected by the iTransformer training pipeline—add correlation_spike and surprise_cpi to the SELECT list so df contains those columns (the df = read_sql_safe(""" ... """, "macroeconomic_indicators") block); confirm the column names match the database schema and include them alongside the existing fields (date, cpi, gdp, etc.) so downstream AI/modules/signal/models/itransformer/train_kaggle.py receives the expected features.
AI/scripts/download_weights.py (1)
30-51: ⚠️ Potential issue | 🟠 Major
Expected artifacts must be verified before a download is judged successful.
Currently, lines 60-79 download straight into the existing directory and treat returncode==0 as success. Even if files are missing or renamed, leftover files from the previous week can make the run look successful. Verify in a temporary directory against the filenames deploy_to_server.py expects, then swap it in.
Fix example
+import shutil
+import tempfile
 ...
 MODELS = [
     {
         "name"   : "PatchTST",
         "slug"   : f"{KAGGLE_USERNAME}/patchtst-training",
         "dst_dir": os.path.join(project_root, "AI/data/weights/PatchTST"),
+        "files"  : ["patchtst_model.pt", "patchtst_scaler.pkl"],
     },
     ...
 ]

 def download_weights(model: dict) -> bool:
@@
-    os.makedirs(model['dst_dir'], exist_ok=True)
-
-    result = subprocess.run(
-        [
-            "kaggle", "kernels", "output",
-            model['slug'],
-            "-p", model['dst_dir']
-        ],
-        capture_output=True,
-        text=True
-    )
+    expected = set(model["files"])
+    with tempfile.TemporaryDirectory(prefix=f"{model['name']}_") as tmp_dir:
+        result = subprocess.run(
+            ["kaggle", "kernels", "output", model["slug"], "-p", tmp_dir],
+            capture_output=True,
+            text=True,
+        )
-    if result.returncode == 0:
+        if result.returncode != 0:
+            print(f"   [{model['name']}] 다운로드 실패! ❌")
+            print(result.stderr)
+            return False
+
+        downloaded = {
+            f for f in os.listdir(tmp_dir)
+            if os.path.isfile(os.path.join(tmp_dir, f))
+        }
+        missing = expected - downloaded
+        if missing:
+            print(f"   [{model['name']}] 누락 파일: {sorted(missing)}")
+            return False
+
+        os.makedirs(model["dst_dir"], exist_ok=True)
+        for f in expected:
+            shutil.move(os.path.join(tmp_dir, f), os.path.join(model["dst_dir"], f))
+        print(f"   [{model['name']}] 다운로드 완료! ✅")
 ...
         return True
-    else:
-        print(f"   [{model['name']}] 다운로드 실패! ❌")
-        print(result.stderr)
-        return False

Also applies to: 60-79
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` around lines 30 - 51, The download currently trusts returncode==0 and writes directly into the destination dirs defined in MODELS, which can mask missing/renamed files; instead download into a temporary directory per model, verify the presence of the exact expected filenames that deploy_to_server.py requires, and only atomically replace the model's dst_dir with the temp directory if verification succeeds. Concretely: add or reference an expected_files list per model (tied to MODELS["name"] or extend each dict with "expected_files"), create a tmp dir (tempfile.mkdtemp()) and download the artifacts there, check that every file in expected_files exists (and non-empty) in the temp dir, on success remove or backup the existing dst_dir and move/rename the temp dir to dst_dir (os.replace / shutil.move) to ensure atomic swap, and if verification fails delete the temp dir and raise/return a non-zero failure so callers know the download didn't produce the expected outputs (also update the download function to use these symbols: MODELS, dst_dir, and expected_files, and make deploy_to_server.py's expected filenames the source of truth).
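The verify-then-swap step described above can be sketched with the standard library only; `verify_and_swap` and the artifact names are illustrative, not existing code in the repo:

```python
import os
import shutil


def verify_and_swap(tmp_dir: str, dst_dir: str, expected_files: set) -> None:
    """Promote tmp_dir to dst_dir only if every expected artifact exists and is non-empty."""
    for name in expected_files:
        path = os.path.join(tmp_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            raise RuntimeError(f"missing or empty artifact: {name}")
    if os.path.isdir(dst_dir):
        shutil.rmtree(dst_dir)  # drop last week's files only after verification passed
    os.replace(tmp_dir, dst_dir)  # atomic rename on the same filesystem
```

Because the old directory is removed only after verification succeeds, a failed download can never silently leave stale weights looking like fresh ones.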
🧹 Nitpick comments (1)
AI/modules/signal/models/itransformer/train_kaggle.py (1)
170-170: Rename the unused loop variable to _ticker to make the intent explicit. A minor cleanup for readability.
Fix example
-    for ticker, group in df.groupby("ticker"):
+    for _ticker, group in df.groupby("ticker"):

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` at line 170, The loop variable "ticker" in the for statement "for ticker, group in df.groupby(\"ticker\"):" is unused; rename it to "_ticker" to indicate it's intentionally ignored (keep "group" as-is) and update any references if they exist elsewhere in the surrounding function (train_kaggle.py) to avoid unused-variable warnings and improve readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/kaggle_notebooks/itransformer/notebook.ipynb`:
- Around line 32-36: The notebook currently hardcodes the PR branch in the
subprocess.run git clone call, which will break if that branch is removed;
change this to use a branch variable sourced from an environment variable with a
default of "main": import os and set something like branch =
os.getenv("PR_BRANCH", "main"), then replace the hardcoded
'20260326-#343-AI-kaggle-pipeline' token in the argument list passed to
subprocess.run with that branch variable so the clone uses the env override if
provided and "main" otherwise; update the subprocess.run call that builds the
list of git args to reference the branch variable instead of the literal string.
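The branch-override fix repeated across these notebook prompts can be sketched as a small helper; the env-var name `PR_BRANCH` follows the prompt's suggestion, and the helper name is illustrative:

```python
import os


def clone_args(repo_url: str, dest: str) -> list:
    """Build git-clone arguments; the branch defaults to 'main' unless PR_BRANCH overrides it."""
    branch = os.getenv("PR_BRANCH", "main")
    return ["git", "clone", "--depth", "1", "--branch", branch, repo_url, dest]


# A notebook cell would then run:
# subprocess.run(clone_args(repo_url, dest), capture_output=True, text=True)
```

This keeps the notebook working after the PR branch is deleted, while CI can still inject a feature branch for testing.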
In `@AI/kaggle_notebooks/patchtst/notebook.ipynb`:
- Around line 18-25: The branch name hardcoded in the subprocess.run(...) git
clone command (branch '20260326-#343-AI-kaggle-pipeline') will break the
pipeline whenever the branch changes; fix it by replacing the branch string
used in the subprocess.run call with an environment variable that defaults to
'main' (e.g., read BRANCH = os.environ.get('CLONE_BRANCH', 'main') and use it
as the --branch value in the git clone argument list); locate the
subprocess.run call and its parameter string and modify them, leaving the
stdout/stderr output logic (print(result.stdout), print(result.stderr))
unchanged.
In `@AI/kaggle_notebooks/TCN/notebook.ipynb`:
- Around line 19-23: The notebook currently hardcodes a PR-specific branch in
the subprocess.run git clone call (the branch string
'20260326-#343-AI-kaggle-pipeline'); change this to use a default branch (e.g.,
'main') and allow overriding via an environment variable (read from os.environ,
e.g., GIT_CLONE_BRANCH) before building the arguments passed to subprocess.run
so the clone uses os.environ.get('GIT_CLONE_BRANCH', 'main') instead of the
fixed branch name.
In `@AI/kaggle_notebooks/transformer/notebook.ipynb`:
- Around line 31-35: The git clone call hardcodes a short-lived PR branch;
change the '--branch' argument in the subprocess.run call so it defaults to
'main' and can be overridden via an environment variable (e.g., read
os.environ.get('CLONE_BRANCH', 'main') and insert that into the args list passed
to subprocess.run). Update the list of arguments used in the subprocess.run
invocation (the one containing '--branch', '20260326-#343-AI-kaggle-pipeline')
to use the env-provided branch value instead of the hardcoded PR branch.
In `@AI/modules/signal/models/PatchTST/train_kaggle.py`:
- Around line 239-252: The code doesn't guard against empty sequences from
build_sequences (X_train/X_val), which will crash DataLoader creation or later
loss/avg computations; before creating train_loader/val_loader in
train_kaggle.py (after build_sequences) check that X_train and X_val are
non-empty (e.g., len(X_train) > 0 and len(X_val) > 0 or X_train.shape[0] > 0),
and if either is empty either raise a clear error/log and exit or return early
(skip training) with a descriptive message; ensure the same check is applied at
the other spot mentioned (the block around the later DataLoader creation at
~299-310) so TensorDataset/torch.DataLoader and subsequent average-loss
computations never receive empty tensors.
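The guard described above can be sketched as a pre-check run before building the DataLoaders; the function name is illustrative and the real check would sit in train_kaggle.py right after build_sequences:

```python
def ensure_nonempty_splits(x_train, x_val) -> None:
    """Fail fast before DataLoader creation if either split produced zero sequences."""
    if len(x_train) == 0 or len(x_val) == 0:
        raise ValueError(
            f"empty sequence split: train={len(x_train)}, val={len(x_val)}; "
            "check seq_len against the available date range"
        )
```

Raising here gives a clear message instead of an opaque TensorDataset or zero-division error deep inside the training loop.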
- Around line 109-113: The code currently only prints missing features (computed
via FEATURE_COLUMNS, available, missing) but must fail fast to avoid later
shape/runtime errors when creating the model with enc_in=len(FEATURE_COLUMNS);
change the check so that if missing is non-empty you raise an explicit error
(e.g., ValueError) or exit with non-zero status including the missing names and
context (full_df columns and expected FEATURE_COLUMNS) so execution stops
immediately instead of proceeding to model creation.
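A minimal sketch of the fail-fast feature check; the FEATURE_COLUMNS list below is an illustrative subset, not the real list from train_kaggle.py:

```python
FEATURE_COLUMNS = ["close", "volume", "rsi_14"]  # illustrative subset, not the real list


def check_features(available_columns) -> None:
    """Raise instead of just printing when expected features are absent."""
    missing = [c for c in FEATURE_COLUMNS if c not in set(available_columns)]
    if missing:
        raise ValueError(
            f"missing features {missing}; dataframe has {sorted(available_columns)}"
        )
```

Stopping here prevents a later shape mismatch when the model is created with enc_in=len(FEATURE_COLUMNS).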
- Around line 47-49: The config currently hardcodes 'end_date' to '2023-12-31',
which excludes all data after 2023; update the code in
AI/modules/signal/models/PatchTST/train_kaggle.py to compute 'end_date'
dynamically (e.g., using today's date) instead of the fixed string so daily runs
include up-to-current data; locate the config entry with the 'end_date' key and
replace the literal with a computed value using the current date (via
datetime.date.today() or datetime.datetime.utcnow().date()) formatted as
'YYYY-MM-DD'.
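The dynamic default can be sketched as below; `resolve_end_date` is an illustrative helper, not existing code, and the same pattern applies to the TCN `--end-date` CLI default mentioned later:

```python
from datetime import date


def resolve_end_date(cli_value=None) -> str:
    """None means 'up to today'; an explicit YYYY-MM-DD string passes through unchanged."""
    return cli_value if cli_value else date.today().strftime("%Y-%m-%d")
```

Weekly runs then retrain on current data by default, while a pinned date can still be supplied for reproducible backtests.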
In `@AI/modules/signal/models/TCN/train_kaggle.py`:
- Around line 161-168: To guard against the edge case where train_df or val_df
becomes an empty dataframe after the date-based split and scaler.fit(...)
fails, add validation right after computing split_date and right after creating
train_df/val_df: variables to check are dates, split_date_idx, split_date,
train_df, val_df; if train_df.empty or val_df.empty, adjust split_date by
moving split_date_idx one step (e.g., decrement or increment safely within
bounds of dates) until both sides are non-empty or fall back to a row-based
split, and update train_df/val_df accordingly; if adjustment is impossible
(very few unique dates), raise a clear exception that includes dates and
split_date_idx so the caller can handle it; apply the same validation/fix for
the other split occurrence around the scaler.fit(...) call mentioned (lines
~172-174).
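One way to pick a split boundary that is guaranteed to leave both sides non-empty — a sketch over plain sorted date values with illustrative names:

```python
def pick_split_date(dates, ratio: float = 0.8):
    """Choose a boundary over unique sorted dates so rows exist strictly before and from it."""
    uniq = sorted(set(dates))
    if len(uniq) < 2:
        raise ValueError(f"need at least 2 unique dates for a split, got {len(uniq)}")
    # Clamp the index into [1, len-1] so neither side can be empty.
    idx = min(max(int(len(uniq) * ratio), 1), len(uniq) - 1)
    return uniq[idx]  # train: date < split, val: date >= split
```

Clamping the index replaces the post-hoc "move one step until non-empty" loop with a bound that cannot produce an empty side in the first place.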
In `@AI/modules/signal/models/transformer/train_kaggle.py`:
- Around line 155-159: The train/val split currently builds separate ID maps
causing inconsistent embedding indices; instead build ticker_to_id and
sector_to_id once from the single full_df and pass those same dicts into
train_single_pipeline so both train and val use identical mappings; locate where
tickers/sectors and ticker_to_id/sector_to_id are created and move that logic to
run once (using full_df.unique as shown) and ensure train_single_pipeline
signature/caller uses the shared ticker_to_id and sector_to_id (also update the
places noted around lines 227-228 to accept the shared maps).
- Around line 131-143: The code currently calls full_df = pd.concat(processed)
which will raise a confusing error when processed is empty; modify the loop that
iterates over raw_df['ticker'].unique() to collect failed tickers (e.g.,
failed_tickers list) inside the except block where fail_count is incremented,
and before calling pd.concat(processed) check if processed is empty; if so,
raise a RuntimeError with a clear message that includes fail_count and the
failed_tickers (and optionally the first exception message) instead of letting
pd.concat fail silently.
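The empty-result guard can be sketched without pandas; collecting `failed_tickers` mirrors the prompt's suggestion, and the names are illustrative:

```python
def concat_guard(processed, failed_tickers):
    """Raise a clear error before pd.concat when every ticker failed preprocessing."""
    if not processed:
        raise RuntimeError(
            f"no tickers survived preprocessing; "
            f"{len(failed_tickers)} failed: {failed_tickers}"
        )
    return processed  # the real code would return pd.concat(processed) here
```

The error names the failed tickers directly instead of surfacing as pandas' generic "No objects to concatenate".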
- Around line 216-223: The current split can leave train empty when tickers has
length 1; add a validation on tickers (the variable computed from
full_df['ticker'].unique()) to ensure at least 2 unique tickers before computing
n_val/train_tickers/val_tickers, and if not, raise a clear ValueError
instructing the user to provide more tickers (or alternatively handle the
single-ticker case by setting n_val=0 and val_tickers empty); update the logic
around n_val, val_tickers, train_tickers, train_df and val_df accordingly so
scaler fitting later (on train_df) never receives an empty DataFrame.
In `@AI/scripts/deploy_to_server.py`:
- Line 101: The remote mkdir call injects model['remote_dir'] directly into
ssh.exec_command which is unsafe for paths containing spaces or shell
metacharacters; change the code that calls ssh.exec_command(f"mkdir -p
{model['remote_dir']}") to either (A) shell-escape or quote the path using
shlex.quote(model['remote_dir']) before interpolating, or (B) avoid the shell
entirely and use the SSH library's SFTP API (e.g., paramiko.SFTPClient.mkdir or
sftp.makedirs) to create the directory; update the location where
ssh.exec_command is invoked and ensure any alternate approach handles parent
dirs and errors robustly.
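Option (A) from the prompt — shell-escaping the path — can be sketched with the standard library only; the helper name is illustrative:

```python
import shlex


def mkdir_command(remote_dir: str) -> str:
    """Quote the remote path so whitespace or metacharacters cannot alter the command."""
    return f"mkdir -p {shlex.quote(remote_dir)}"
```

A benign path passes through unchanged, while a hostile one is wrapped in single quotes and executed as a single argument rather than as extra shell commands.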
In `@AI/scripts/trigger_training.py`:
- Around line 35-38: Update the notebook_dir path string to match the actual
lowercase folder name so the existence check succeeds: change the value passed
to os.path.join(project_root, "AI/kaggle_notebooks/patchtst") where the dict for
"name": "PatchTST" / "slug": "patchtst-training" is defined (this is the entry
used later in the folder existence check around Line 61); ensure the path uses
"patchtst" (all lowercase) to match the filesystem.
In `@AI/scripts/upload_to_kaggle.py`:
- Around line 24-27: Before attempting upload, validate that the dataset
metadata file exists and its "id" matches the expected target constructed from
KAGGLE_USERNAME and DATASET_SLUG: read and parse dataset-metadata.json, confirm
the "id" field equals f"{KAGGLE_USERNAME}/{DATASET_SLUG}", and if missing or
mismatched log an error via the same logger and abort (exit non-zero). Update
places that print the upload target (where OUTPUT_DIR, KAGGLE_USERNAME,
DATASET_SLUG are referenced and in the upload routine around lines ~58-67) to
show the actual id from dataset-metadata.json when proceeding. Ensure failures
handle FileNotFoundError/JSON parse errors with clear error messages and do not
attempt to call the Kaggle upload step when validation fails.
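A stdlib sketch of the pre-upload check described above; the function name and messages are illustrative:

```python
import json
import os


def validate_dataset_id(metadata_path: str, expected_id: str) -> None:
    """Abort before the Kaggle upload if the metadata file is missing or targets another dataset."""
    if not os.path.exists(metadata_path):
        raise FileNotFoundError(f"dataset-metadata.json not found: {metadata_path}")
    with open(metadata_path, encoding="utf-8") as f:
        actual = json.load(f).get("id")
    if actual != expected_id:
        raise ValueError(f"dataset id mismatch: expected={expected_id}, actual={actual}")
```

The caller would wrap this in a try/except that logs and exits non-zero, so the kaggle CLI is never invoked against the wrong dataset.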
---
Duplicate comments:
In `@AI/modules/signal/models/TCN/train_kaggle.py`:
- Around line 287-288: The CLI currently hardcodes
parser.add_argument("--end-date", default="2023-12-31") which prevents automatic
retrain on newest data; change the default to a dynamic value (e.g., compute
today's date at runtime or use None meaning "up to today") and update any
downstream code that reads the --end-date argument (the parser usage sites) to
handle the new sentinel by converting None to datetime.date.today() or
formatting today's date string; ensure you import datetime where needed and
update any tests or invocations that expect the old fixed date.
In `@AI/modules/signal/models/transformer/train_kaggle.py`:
- Around line 82-83: The artifact filenames in the config dict (the 'model_name'
and 'scaler_name' entries in train_kaggle.py) don't follow the deployment
resolver naming convention and may fail during download; update those entries to
match the resolver pattern (e.g., append the required suffix such as "_prod"
before the file extension: change "multi_horizon_model.keras" →
"multi_horizon_model_prod.keras" and "multi_horizon_scaler.pkl" →
"multi_horizon_scaler_prod.pkl"), or alternatively generate names via the
project's artifact-resolver helper (use the existing helper/constant that
produces resolver-compliant names) so both 'model_name' and 'scaler_name' always
follow the deployment naming convention.
In `@AI/pipelines/weekly_routine.py`:
- Around line 169-172: The call to run_weekly_pipeline is ignoring its return
value so failures won't set a non‑zero process exit status; change the main
invocation to capture the function's return (e.g., ret =
run_weekly_pipeline(...)) and propagate it via sys.exit(ret) (or sys.exit(1)
when the function indicates failure) or re-raise/raise SystemExit based on
run_weekly_pipeline's contract; update the invocation site around
run_weekly_pipeline to exit with the proper code instead of discarding the
result.
In `@AI/scripts/deploy_to_server.py`:
- Around line 103-118: The current deploy_model loop always returns True even if
files were missing or transfer failed; modify deploy_model to track failures by
introducing a boolean or list (e.g., had_failures or failed_files) and mark it
when a local file is missing or when scp.put raises/returns an error (wrap
scp.put in try/except and record failures on exceptions or non-success results),
and at the end return False if any failures were recorded, otherwise return
True; reference the for loop iterating model['files'], the
local_path/remote_path handling, and the scp.put call to locate where to add the
failure collection and the final conditional return.
- Line 78: Replace the insecure AutoAddPolicy usage by enforcing host-key
verification: in the SSH client setup (SSHClient instance where
ssh.set_missing_host_key_policy is called) remove paramiko.AutoAddPolicy() and
instead call ssh.load_system_host_keys() or
ssh.load_host_keys(<path_to_injected_known_hosts>) and set
ssh.set_missing_host_key_policy(paramiko.RejectPolicy()); also add clear error
handling in the connection flow (e.g., catch
paramiko.SSHException/BadHostKeyException) to fail the deploy when the host key
is not recognized so the pipeline uses pre-registered known_hosts or injected
secret.
In `@AI/scripts/download_weights.py`:
- Around line 30-51: The download currently trusts returncode==0 and writes
directly into the destination dirs defined in MODELS, which can mask
missing/renamed files; instead download into a temporary directory per model,
verify the presence of the exact expected filenames that deploy_to_server.py
requires, and only atomically replace the model's dst_dir with the temp
directory if verification succeeds. Concretely: add or reference an
expected_files list per model (tied to MODELS["name"] or extend each dict with
"expected_files"), create a tmp dir (tempfile.mkdtemp()) and download the
artifacts there, check that every file in expected_files exists (and non-empty)
in the temp dir, on success remove or backup the existing dst_dir and
move/rename the temp dir to dst_dir (os.replace / shutil.move) to ensure atomic
swap, and if verification fails delete the temp dir and raise/return a non-zero
failure so callers know the download didn't produce the expected outputs (also
update the download function to use these symbols: MODELS, dst_dir, and
expected_files, and make deploy_to_server.py's expected filenames the source of
truth).
In `@AI/scripts/extract_to_parquet.py`:
- Around line 195-202: The SELECT used to build df in extract_to_parquet.py is
missing the features expected by the iTransformer training pipeline—add
correlation_spike and surprise_cpi to the SELECT list so df contains those
columns (the df = read_sql_safe(""" ... """, "macroeconomic_indicators") block);
confirm the column names match the database schema and include them alongside
the existing fields (date, cpi, gdp, etc.) so downstream
AI/modules/signal/models/itransformer/train_kaggle.py receives the expected
features.
---
Nitpick comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Line 170: The loop variable "ticker" in the for statement "for ticker, group
in df.groupby(\"ticker\"):" is unused; rename it to "_ticker" to indicate it's
intentionally ignored (keep "group" as-is) and update any references if they
exist elsewhere in the surrounding function (train_kaggle.py) to avoid
unused-variable warnings and improve readability.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f22849b3-101b-48f2-8d63-bcc15d4786c4
📒 Files selected for processing (20)
- .github/workflows/train.yml
- AI/kaggle_notebooks/TCN/kernel-metadata.json
- AI/kaggle_notebooks/TCN/notebook.ipynb
- AI/kaggle_notebooks/itransformer/kernel-metadata.json
- AI/kaggle_notebooks/itransformer/notebook.ipynb
- AI/kaggle_notebooks/patchtst/kernel-metadata.json
- AI/kaggle_notebooks/patchtst/notebook.ipynb
- AI/kaggle_notebooks/transformer/kernel-metadata.json
- AI/kaggle_notebooks/transformer/notebook.ipynb
- AI/modules/signal/models/PatchTST/train.py
- AI/modules/signal/models/PatchTST/train_kaggle.py
- AI/modules/signal/models/TCN/train_kaggle.py
- AI/modules/signal/models/itransformer/train_kaggle.py
- AI/modules/signal/models/transformer/train_kaggle.py
- AI/pipelines/weekly_routine.py
- AI/scripts/deploy_to_server.py
- AI/scripts/download_weights.py
- AI/scripts/extract_to_parquet.py
- AI/scripts/trigger_training.py
- AI/scripts/upload_to_kaggle.py
✅ Files skipped from review due to trivial changes (4)
- AI/kaggle_notebooks/itransformer/kernel-metadata.json
- AI/kaggle_notebooks/patchtst/kernel-metadata.json
- .github/workflows/train.yml
- AI/kaggle_notebooks/transformer/kernel-metadata.json
🚧 Files skipped from review as they are similar to previous changes (2)
- AI/kaggle_notebooks/TCN/kernel-metadata.json
- AI/modules/signal/models/PatchTST/train.py
"import subprocess\n",
"result = subprocess.run(\n",
"    ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],\n",
"    capture_output=True, text=True\n",
")\n",
Hardcoding the PR branch is fragile for long-term operation.
If the branch disappears, this notebook fails immediately. Switch to a main default with an environment-variable override.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/kaggle_notebooks/itransformer/notebook.ipynb` around lines 32 - 36, The
notebook currently hardcodes the PR branch in the subprocess.run git clone call,
which will break if that branch is removed; change this to use a branch variable
sourced from an environment variable with a default of "main": import os and set
something like branch = os.getenv("PR_BRANCH", "main"), then replace the
hardcoded '20260326-#343-AI-kaggle-pipeline' token in the argument list passed
to subprocess.run with that branch variable so the clone uses the env override
if provided and "main" otherwise; update the subprocess.run call that builds the
list of git args to reference the branch variable instead of the literal string.
This was pasted in to train the code in the Kaggle environment; once notebook training is complete, I plan to remove it and update.
@vmgfh878-art, understood! So this notebook is in temporary use during the PR. If you plan to clean it up against the main branch after training completes, that's fine. Would you like me to create a follow-up issue so it isn't forgotten later?
"import subprocess\n",
"result = subprocess.run(\n",
"    ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],\n",
"    capture_output=True, text=True\n",
")\n",
Do not pin the notebook's execution branch to a PR-specific name.
If the branch is deleted or renamed, Kaggle training fails immediately. Switching to a main default with an environment-variable override is safer.
🧰 Tools
🪛 Ruff (0.15.7)
[error] 20-20: Starting a process with a partial executable path
(S607)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/kaggle_notebooks/TCN/notebook.ipynb` around lines 19 - 23, The notebook
currently hardcodes a PR-specific branch in the subprocess.run git clone call
(the branch string '20260326-#343-AI-kaggle-pipeline'); change this to use a
default branch (e.g., 'main') and allow overriding via an environment variable
(read from os.environ, e.g., GIT_CLONE_BRANCH) before building the arguments
passed to subprocess.run so the clone uses os.environ.get('GIT_CLONE_BRANCH',
'main') instead of the fixed branch name.
"import subprocess\n",
"result = subprocess.run(\n",
"    ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],\n",
"    capture_output=True, text=True\n",
")\n",
Do not leave the clone target fixed to a PR branch.
If the branch expires or is deleted, the notebook fails immediately. Defaulting to main and injecting the branch via an environment variable when needed is more stable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/kaggle_notebooks/transformer/notebook.ipynb` around lines 31 - 35, The git
clone call hardcodes a short-lived PR branch; change the '--branch' argument in
the subprocess.run call so it defaults to 'main' and can be overridden via an
environment variable (e.g., read os.environ.get('CLONE_BRANCH', 'main') and
insert that into the args list passed to subprocess.run). Update the list of
arguments used in the subprocess.run invocation (the one containing '--branch',
'20260326-#343-AI-kaggle-pipeline') to use the env-provided branch value instead
of the hardcoded PR branch.
'start_date' : '2015-01-01',
'end_date'   : '2023-12-31',  # Block future data (prevent look-ahead bias)
'seq_len'    : 120,
end_date is fixed, contrary to the goal of weekly automatic retraining.
As of today (2026-03-29), with the value fixed at 2023-12-31, data from 2024 onward is permanently excluded from default runs.
Fix example
-    'end_date'   : '2023-12-31',  # Block future data (prevent look-ahead bias)
+    'end_date'   : os.environ.get('END_DATE', pd.Timestamp.today().strftime('%Y-%m-%d')),

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
'start_date' : '2015-01-01',
'end_date'   : os.environ.get('END_DATE', pd.Timestamp.today().strftime('%Y-%m-%d')),
'seq_len'    : 120,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/PatchTST/train_kaggle.py` around lines 47 - 49, The
config currently hardcodes 'end_date' to '2023-12-31', which excludes all data
after 2023; update the code in AI/modules/signal/models/PatchTST/train_kaggle.py
to compute 'end_date' dynamically (e.g., using today's date) instead of the
fixed string so daily runs include up-to-current data; locate the config entry
with the 'end_date' key and replace the literal with a computed value using the
current date (via datetime.date.today() or datetime.datetime.utcnow().date())
formatted as 'YYYY-MM-DD'.
tickers = sorted(full_df['ticker'].unique())
sectors = sorted(full_df['sector'].unique() if 'sector' in full_df.columns else ['Unknown'])
ticker_to_id = {t: i for i, t in enumerate(tickers)}
sector_to_id = {s: i for i, s in enumerate(sectors)}
Mapping ticker/sector IDs separately for train and val distorts the embedding semantics.
Because train and val currently use different ID dictionaries, ticker_id=0 in val may refer to a different ticker than ticker_id=0 in train. The embedding inputs lose their meaning, and validation metrics become unreliable.
Suggested direction
-def build_sequences_transformer(full_df, scaler, fit_scaler=True):
+def build_sequences_transformer(full_df, scaler, ticker_to_id, sector_to_id, fit_scaler=True):
@@
- tickers = sorted(full_df['ticker'].unique())
- sectors = sorted(full_df['sector'].unique() if 'sector' in full_df.columns else ['Unknown'])
- ticker_to_id = {t: i for i, t in enumerate(tickers)}
- sector_to_id = {s: i for i, s in enumerate(sectors)}
+ tickers = list(ticker_to_id.keys())
+    sectors = list(sector_to_id.keys())

Then, in train_single_pipeline(), build the mappings once from full_df and pass the same mappings to both train and val.
Also applies to: 227-228
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/transformer/train_kaggle.py` around lines 155 - 159,
The train/val split currently builds separate ID maps causing inconsistent
embedding indices; instead build ticker_to_id and sector_to_id once from the
single full_df and pass those same dicts into train_single_pipeline so both
train and val use identical mappings; locate where tickers/sectors and
ticker_to_id/sector_to_id are created and move that logic to run once (using
full_df.unique as shown) and ensure train_single_pipeline signature/caller uses
the shared ticker_to_id and sector_to_id (also update the places noted around
lines 227-228 to accept the shared maps).
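The shared-mapping idea reduces to building the dictionaries once and passing them around; `build_id_maps` is a sketch, not an existing function:

```python
def build_id_maps(tickers, sectors):
    """Build ticker/sector -> id maps once, from the full dataframe, so train and val agree."""
    ticker_to_id = {t: i for i, t in enumerate(sorted(set(tickers)))}
    sector_to_id = {s: i for i, s in enumerate(sorted(set(sectors)))}
    return ticker_to_id, sector_to_id
```

Both splits then index the same embedding rows, so validation loss actually measures the trained embeddings.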
tickers = full_df['ticker'].unique()
n_val = max(1, int(len(tickers) * 0.2))
val_tickers = tickers[-n_val:]
train_tickers = tickers[:-n_val]

train_df = full_df[full_df['ticker'].isin(train_tickers)].copy()
val_df = full_df[full_df['ticker'].isin(val_tickers)].copy()
print(f">> Train: {len(train_tickers)}개, Val: {len(val_tickers)}개 종목")
When there is only one ticker, the current split logic makes training impossible.
Because of n_val=max(1, int(...)), len(tickers)==1 leaves the train set empty and the scaler-fitting step fails. A minimum-ticker-count check is needed.
Fix example
tickers = full_df['ticker'].unique()
+ if len(tickers) < 2:
+ raise ValueError(f"학습을 위한 ticker가 부족합니다. 현재 {len(tickers)}개")
- n_val = max(1, int(len(tickers) * 0.2))
+ n_val = max(1, min(int(len(tickers) * 0.2), len(tickers) - 1))

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
tickers = full_df['ticker'].unique()
if len(tickers) < 2:
    raise ValueError(f"학습을 위한 ticker가 부족합니다. 현재 {len(tickers)}개")
n_val = max(1, min(int(len(tickers) * 0.2), len(tickers) - 1))
val_tickers = tickers[-n_val:]
train_tickers = tickers[:-n_val]
train_df = full_df[full_df['ticker'].isin(train_tickers)].copy()
val_df = full_df[full_df['ticker'].isin(val_tickers)].copy()
print(f">> Train: {len(train_tickers)}개, Val: {len(val_tickers)}개 종목")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/transformer/train_kaggle.py` around lines 216 - 223,
The current split can leave train empty when tickers has length 1; add a
validation on tickers (the variable computed from full_df['ticker'].unique()) to
ensure at least 2 unique tickers before computing
n_val/train_tickers/val_tickers, and if not, raise a clear ValueError
instructing the user to provide more tickers (or alternatively handle the
single-ticker case by setting n_val=0 and val_tickers empty); update the logic
around n_val, val_tickers, train_tickers, train_df and val_df accordingly so
scaler fitting later (on train_df) never receives an empty DataFrame.
    """Deploy model weights to the server"""
    print(f"\n>> [{model['name']}] 배포 중...")

    ssh.exec_command(f"mkdir -p {model['remote_dir']}")
The path is interpolated into the remote command as-is.
Line 101 has the form exec_command(f"mkdir -p {model['remote_dir']}"), so a path value containing spaces or shell metacharacters risks unintended command execution. Apply shell escaping at a minimum.
Fix example
+import shlex
...
- ssh.exec_command(f"mkdir -p {model['remote_dir']}")
+ remote_dir_q = shlex.quote(model["remote_dir"])
+ ssh.exec_command(f"mkdir -p {remote_dir_q}")

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
import shlex

remote_dir_q = shlex.quote(model["remote_dir"])
ssh.exec_command(f"mkdir -p {remote_dir_q}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/deploy_to_server.py` at line 101, The remote mkdir call injects
model['remote_dir'] directly into ssh.exec_command which is unsafe for paths
containing spaces or shell metacharacters; change the code that calls
ssh.exec_command(f"mkdir -p {model['remote_dir']}") to either (A) shell-escape
or quote the path using shlex.quote(model['remote_dir']) before interpolating,
or (B) avoid the shell entirely and use the SSH library's SFTP API (e.g.,
paramiko.SFTPClient.mkdir or sftp.makedirs) to create the directory; update the
location where ssh.exec_command is invoked and ensure any alternate approach
handles parent dirs and errors robustly.
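Option (A) from the prompt above, quoting before interpolation, can be sketched as follows. The helper name `safe_mkdir_cmd` is hypothetical; the real call site is `ssh.exec_command` in deploy_to_server.py:

```python
import shlex

def safe_mkdir_cmd(remote_dir: str) -> str:
    """Build a mkdir -p command with the path shell-escaped."""
    return f"mkdir -p {shlex.quote(remote_dir)}"

# A benign path is left untouched; a path containing a space or a shell
# metacharacter is wrapped in single quotes so the remote shell treats
# it as a single argument rather than executing part of it.
```

`shlex.quote` only adds quoting when the string contains characters outside the POSIX-safe set, so ordinary paths pass through unchanged.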
OUTPUT_DIR = os.path.join(project_root, "AI/data/kaggle_data")
KAGGLE_USERNAME = os.environ.get("KAGGLE_USERNAME", "jihyeongkimm")
DATASET_SLUG = "sisc-ai-trading-dataset"
There is no validation of the upload target, so the wrong dataset could be updated.
The current output message shows KAGGLE_USERNAME/DATASET_SLUG, but the actual upload target is determined by the id in dataset-metadata.json. Add a pre-run check that the metadata exists and its ID matches.
Suggested fix
import os
+import json
import subprocess
import sys
@@
print(f"\n>> Kaggle 데이터셋 업로드 중...")
print(f" 대상: {KAGGLE_USERNAME}/{DATASET_SLUG}")
+
+metadata_path = os.path.join(OUTPUT_DIR, "dataset-metadata.json")
+if not os.path.exists(metadata_path):
+ print(f"❌ dataset-metadata.json 없음: {metadata_path}")
+ sys.exit(1)
+
+with open(metadata_path, "r", encoding="utf-8") as f:
+ metadata = json.load(f)
+
+expected_id = f"{KAGGLE_USERNAME}/{DATASET_SLUG}"
+if metadata.get("id") != expected_id:
+ print(f"❌ dataset id 불일치: expected={expected_id}, actual={metadata.get('id')}")
+ sys.exit(1)

Also applies to: 58-67
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/upload_to_kaggle.py` around lines 24 - 27, Before attempting
upload, validate that the dataset metadata file exists and its "id" matches the
expected target constructed from KAGGLE_USERNAME and DATASET_SLUG: read and
parse dataset-metadata.json, confirm the "id" field equals
f"{KAGGLE_USERNAME}/{DATASET_SLUG}", and if missing or mismatched log an error
via the same logger and abort (exit non-zero). Update places that print the
upload target (where OUTPUT_DIR, KAGGLE_USERNAME, DATASET_SLUG are referenced
and in the upload routine around lines ~58-67) to show the actual id from
dataset-metadata.json when proceeding. Ensure failures handle
FileNotFoundError/JSON parse errors with clear error messages and do not attempt
to call the Kaggle upload step when validation fails.
Actionable comments posted: 4
♻️ Duplicate comments (3)
AI/modules/signal/models/TCN/train_kaggle.py (2)
288-288: ⚠️ Potential issue | 🟠 Major
Because end_date is fixed at 2023-12-31, automated retraining keeps missing the most recent window. Since the PR's purpose is weekly automated retraining, it is safer to make the default dynamic based on the run date (or an ENV override).
🔧 Suggested fix
+from datetime import date
+
+def _default_end_date() -> str:
+    return os.environ.get("END_DATE", date.today().isoformat())
 ...
-parser.add_argument("--end-date", default="2023-12-31")
+parser.add_argument("--end-date", default=_default_end_date())
 ...
-end_date = "2023-12-31",
+end_date = _default_end_date(),

Also applies to: 309-310
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/TCN/train_kaggle.py` at line 288, The hardcoded default for the CLI flag in parser.add_argument("--end-date", default="2023-12-31") causes automated retraining to skip recent data; change the default to a dynamic value (e.g., compute datetime.date.today().isoformat() at runtime or accept an environment override) or set the argparse default to None and compute end_date from datetime.date.today() later in the script; update all occurrences of the same flag (see other parser.add_argument calls around the second mention at lines ~309-310) so the pipeline uses the current execution date by default while still allowing explicit CLI/ENV overrides.
161-174: ⚠️ Potential issue | 🟠 Major
If the date-split boundary produces an empty train/val set, the run fails immediately at Line 172.
The calculation at Line 162 can leave train_df or val_df empty when there are few unique dates or the distribution is skewed. The split must be validated before scaler.fit(...).
🔧 Suggested fix
 dates = raw_df['date'].sort_values().unique()
-split_date_idx = int(len(dates) * 0.8)
+if len(dates) < 2:
+    raise ValueError(f"날짜 분할에 필요한 데이터가 부족합니다. unique_dates={len(dates)}")
+split_date_idx = min(max(1, int(len(dates) * 0.8)), len(dates) - 1)
 split_date = dates[split_date_idx]
 train_df = raw_df[raw_df['date'] < split_date].copy()
 val_df = raw_df[raw_df['date'] >= split_date].copy()
+if train_df.empty or val_df.empty:
+    raise ValueError(
+        f"분할 결과가 비었습니다. split_date={split_date}, train={len(train_df)}, val={len(val_df)}"
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/TCN/train_kaggle.py` around lines 161 - 174, The split by unique dates can produce an empty train_df or val_df before scaler.fit; update the logic around split_date_idx/split_date to validate and adjust the split so both train_df and val_df are non-empty before scaling: after computing dates and split_date_idx, loop or adjust split_date_idx (e.g., move +1 or -1) until raw_df[date < split_date] and raw_df[date >= split_date] both contain rows, or fall back to a safe default split (e.g., single-date guard or raise a clear error), then proceed to fit StandardScaler on train_df[FEATURE_COLUMNS] and transform both; ensure you reference split_date_idx, split_date, train_df, val_df, FEATURE_COLUMNS, and scaler in the change.

AI/kaggle_notebooks/TCN/notebook.ipynb (1)
20-23: ⚠️ Potential issue | 🟠 Major
Please fix both the hardcoded branch and the unchecked clone failure.
The PR-specific branch hardcoded at Line 21 is a serious operational-stability risk. In addition, lines 20-23 do not abort immediately when the clone fails, which can surface as unexplained failures in later cells.
🔧 Suggested fix
-import subprocess
+import os
+import subprocess
+branch = os.environ.get("GIT_CLONE_BRANCH", "main")
 result = subprocess.run(
-    ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],
-    capture_output=True, text=True
+    ['git', 'clone', '--depth', '1', '--branch', branch, 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],
+    capture_output=True, text=True, check=True
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/TCN/notebook.ipynb` around lines 20 - 23, the cell that runs git clone via subprocess.run hardcodes the branch name ('20260326-#343-AI-kaggle-pipeline') and does not handle clone failures, so later cells can fail with unclear errors; change the subprocess.run call so the branch and repository URL are injected via environment variables or function/variable arguments (e.g., BRANCH_NAME, REPO_URL, or a notebook parameter) instead of being hardcoded, check the returncode of the subprocess.run(..., capture_output=True, text=True) result, and on failure log stderr/stdout and raise an exception or abort immediately (exit/raise) to stop subsequent execution (references: the subprocess.run call, clone target path '/kaggle/working/sisc-web').
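The injection-plus-check pattern above can be sketched like this. The names `build_clone_cmd`, `run_clone`, and the GIT_CLONE_BRANCH variable are illustrative; only the argv builder is exercised here, since actually cloning needs network access:

```python
import os
import subprocess

REPO_URL = "https://github.com/SISC-IT/sisc-web.git"

def build_clone_cmd(workdir, branch=None):
    """Compose the git clone argv with the branch injected, not hardcoded."""
    branch = branch or os.environ.get("GIT_CLONE_BRANCH", "main")
    return ["git", "clone", "--depth", "1", "--branch", branch, REPO_URL, workdir]

def run_clone(workdir):
    result = subprocess.run(build_clone_cmd(workdir), capture_output=True, text=True)
    if result.returncode != 0:
        # Surface stderr instead of letting later cells fail mysteriously.
        raise RuntimeError(f"git clone failed:\n{result.stderr}")
    return result
```

Passing `check=True` to `subprocess.run` is an equivalent shorthand for the returncode check, at the cost of a less tailored error message.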
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Line 55: The config currently hardcodes "train_end_date" to "2023-12-31",
causing weekly retraining to ignore new data; change the default in the config
dict (the "train_end_date" entry) to None (or remove the fixed date) so the
training pipeline (functions using train_end_date in train_kaggle.py) interprets
None as "use the latest available data"; also update the other hardcoded
occurrences referenced in the diff (the later "train_end_date" usages) to follow
the same pattern and ensure any logic that slices data treats None as no cutoff
(and only applies an external cutoff when an explicit date string is provided).
- Around line 138-146: The macro change calculations rely on row order; sort
macro_df by the "date" column (e.g., macro_df = macro_df.sort_values("date"))
and then deduplicate explicitly (use drop_duplicates(subset="date", keep="last"
or "first") depending on desired policy) before calling diff()/pct_change() so
us10y_chg, dxy_chg, and vix_change_rate are computed on chronologically ordered
data; apply this to the code paths around macro_df, available_macro, and the
subsequent if blocks referencing "us10y", "dxy_close", and "vix_close".
- Around line 223-247: Replace the two uses of layers.Lambda(lambda t:
tf.transpose(...)) with the built-in Permute layer to avoid Lambda serialization
issues: locate the transpose applied to seq_input (currently assigned to x via
layers.Lambda(lambda t: tf.transpose(t, perm=[0, 2, 1]))(seq_input)) and the
final transpose applied to x (layers.Lambda(lambda t: tf.transpose(t, perm=[0,
2, 1]))(x)) and change them to layers.Permute((2,1)) so the model uses Permute
around the Transformer blocks (preserving the [batch, seq, feat] → [batch, feat,
seq] and back transformations) and can be safely saved/loaded without
custom_objects.
In `@AI/modules/signal/models/TCN/train_kaggle.py`:
- Around line 273-281: The _find_kaggle_dataset_path function currently returns
any directory under /kaggle/input that contains any .parquet file; update it to
specifically check for the presence of "price_data.parquet" (instead of any
.parquet) before returning a directory, and fall back to
os.environ.get("PARQUET_DIR", base) if no directory contains price_data.parquet;
keep the same return behavior and error/fallback semantics but use the filename
match to avoid false-positive dataset paths.
---
Duplicate comments:
In `@AI/kaggle_notebooks/TCN/notebook.ipynb`:
- Around line 20-23: The cell that runs git clone via subprocess.run hardcodes
the branch name ('20260326-#343-AI-kaggle-pipeline') and does not handle clone
failures, so later cells can fail with unexplained errors; change the
subprocess.run call so the branch and repository URL are injected via
environment variables or function/variable arguments (e.g., BRANCH_NAME,
REPO_URL, or a notebook parameter) instead of being hardcoded, check the
returncode of the subprocess.run(..., capture_output=True, text=True) result,
and on failure log stderr/stdout and raise an exception or abort immediately
(exit/raise) to stop subsequent execution (references: the subprocess.run call,
clone target path '/kaggle/working/sisc-web').
In `@AI/modules/signal/models/TCN/train_kaggle.py`:
- Line 288: The hardcoded default for the CLI flag in
parser.add_argument("--end-date", default="2023-12-31") causes automated
retraining to skip recent data; change the default to a dynamic value (e.g.,
compute datetime.date.today().isoformat() at runtime or accept an environment
override) or set the argparse default to None and compute end_date from
datetime.date.today() later in the script; update all occurrences of the same
flag (see other parser.add_argument calls around the second mention at lines
~309-310) so the pipeline uses the current execution date by default while still
allowing explicit CLI/ENV overrides.
- Around line 161-174: The split by unique dates can produce an empty train_df
or val_df before scaler.fit; update the logic around split_date_idx/split_date
to validate and adjust the split so both train_df and val_df are non-empty
before scaling: after computing dates and split_date_idx, loop or adjust
split_date_idx (e.g., move +1 or -1) until raw_df[date < split_date] and
raw_df[date >= split_date] both contain rows, or fall back to a safe default
split (e.g., single-date guard or raise a clear error), then proceed to fit
StandardScaler on train_df[FEATURE_COLUMNS] and transform both; ensure you
reference split_date_idx, split_date, train_df, val_df, FEATURE_COLUMNS, and
scaler in the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d891ff15-e946-4172-9146-14aadf44f21f
📒 Files selected for processing (3)
AI/kaggle_notebooks/TCN/notebook.ipynb
AI/modules/signal/models/TCN/train_kaggle.py
AI/modules/signal/models/itransformer/train_kaggle.py
CONFIG = {
    "lookback" : 60,
    "horizons" : [1, 3, 5, 7],
    "train_end_date" : "2023-12-31",
The training end date is fixed at 2023-12-31 even though this is a weekly retraining pipeline.
With this setting, all data ingested after 2024-01-01 is discarded no matter how often the model is retrained each week. If the PR's goal is automated retraining on the latest data, the default should use all of the latest data, with an external cutoff setting applied only when backfill/reproduction is needed.
🛠️ Example fix
CONFIG = {
- "train_end_date" : "2023-12-31",
+ "train_end_date" : os.environ.get("TRAIN_END_DATE"), # None이면 최신 데이터까지 사용
...
- df = df[df["date"] <= CONFIG["train_end_date"]]
+ if CONFIG["train_end_date"]:
+ df = df[df["date"] <= pd.Timestamp(CONFIG["train_end_date"])]

Also applies to: 156-157
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/itransformer/train_kaggle.py` at line 55, The config
currently hardcodes "train_end_date" to "2023-12-31", causing weekly retraining
to ignore new data; change the default in the config dict (the "train_end_date"
entry) to None (or remove the fixed date) so the training pipeline (functions
using train_end_date in train_kaggle.py) interprets None as "use the latest
available data"; also update the other hardcoded occurrences referenced in the
diff (the later "train_end_date" usages) to follow the same pattern and ensure
any logic that slices data treats None as no cutoff (and only applies an
external cutoff when an explicit date string is provided).
macro_df = macro_df[available_macro].drop_duplicates("date")

# 변화율 계산
if "us10y" in macro_df.columns:
    macro_df["us10y_chg"] = macro_df["us10y"].diff()
if "dxy_close" in macro_df.columns:
    macro_df["dxy_chg"] = macro_df["dxy_close"].pct_change()
if "vix_close" in macro_df.columns:
    macro_df["vix_change_rate"] = macro_df["vix_close"].pct_change()
Date sorting is missing before the macro change-rate calculations.
diff()/pct_change() depend on the current row order. If the parquet load order is not chronological, us10y_chg, dxy_chg, and vix_change_rate are computed incorrectly and the training features are silently corrupted. Deduplication is also safer done after sorting, with an explicit keep policy.
🛠️ Example fix
- macro_df = macro_df[available_macro].drop_duplicates("date")
+ macro_df = (
+ macro_df[available_macro]
+ .sort_values("date")
+ .drop_duplicates("date", keep="last")
+ .reset_index(drop=True)
+ )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 138 -
146, The macro change calculations rely on row order; sort macro_df by the
"date" column (e.g., macro_df = macro_df.sort_values("date")) and then
deduplicate explicitly (use drop_duplicates(subset="date", keep="last" or
"first") depending on desired policy) before calling diff()/pct_change() so
us10y_chg, dxy_chg, and vix_change_rate are computed on chronologically ordered
data; apply this to the code paths around macro_df, available_macro, and the
subsequent if blocks referencing "us10y", "dxy_close", and "vix_close".
def _find_kaggle_dataset_path() -> str:
    """Kaggle 입력 데이터셋 경로를 자동으로 탐색"""
    base = "/kaggle/input"
    if os.path.exists(base):
        for entry in os.listdir(base):
            full = os.path.join(base, entry)
            if os.path.isdir(full) and any(f.endswith(".parquet") for f in os.listdir(full)):
                return full
    return os.environ.get("PARQUET_DIR", base)
The automatic dataset discovery does not guarantee price_data.parquet, so it can return a false-positive path.
Line 115 reads price_data.parquet by a fixed name, but the current discovery logic only checks for "any parquet". Make the directory-selection criterion match the filename as well.
🔧 Suggested fix
🔧 제안 수정
def _find_kaggle_dataset_path() -> str:
"""Kaggle 입력 데이터셋 경로를 자동으로 탐색"""
base = "/kaggle/input"
if os.path.exists(base):
- for entry in os.listdir(base):
+ for entry in sorted(os.listdir(base)):
full = os.path.join(base, entry)
- if os.path.isdir(full) and any(f.endswith(".parquet") for f in os.listdir(full)):
+ if os.path.isdir(full) and os.path.exists(os.path.join(full, "price_data.parquet")):
return full
return os.environ.get("PARQUET_DIR", base)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/TCN/train_kaggle.py` around lines 273 - 281, The
_find_kaggle_dataset_path function currently returns any directory under
/kaggle/input that contains any .parquet file; update it to specifically check
for the presence of "price_data.parquet" (instead of any .parquet) before
returning a directory, and fall back to os.environ.get("PARQUET_DIR", base) if
no directory contains price_data.parquet; keep the same return behavior and
error/fallback semantics but use the filename match to avoid false-positive
dataset paths.
Actionable comments posted: 1
♻️ Duplicate comments (3)
AI/modules/signal/models/itransformer/train_kaggle.py (3)
55-55: ⚠️ Potential issue | 🟠 Major
A fixed end date that conflicts with the weekly-retraining goal.
train_end_date is fixed at "2023-12-31" (Line 157) and the cutoff is always applied, so data after 2024-01-01 is excluded from training every week. It is safer to default to None (use up to the latest data) and restrict via an environment variable only for explicit backfills.
Suggested fix
 CONFIG = {
-    "train_end_date" : "2023-12-31",
+    "train_end_date" : os.environ.get("TRAIN_END_DATE"),
     ...
 }
 ...
-df = df[df["date"] <= CONFIG["train_end_date"]]
+if CONFIG["train_end_date"]:
+    df = df[df["date"] <= pd.Timestamp(CONFIG["train_end_date"])]

Also applies to: 156-157
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` at line 55, The hardcoded train_end_date value ("2023-12-31") in train_kaggle.py causes post-2023 data to be excluded every week; change the default for train_end_date to None (or to read an optional BACKFILL_TRAIN_END_DATE env var) so training uses data up to "now" by default, and only apply a fixed cutoff when an explicit environment variable is provided for backfill runs; update references that read train_end_date (the variable named train_end_date around the train configuration/initialization) to handle None as “no cutoff” and convert an explicit env var to a date when present.
223-224: ⚠️ Potential issue | 🟠 Major
Lambda(tf.transpose) carries a significant save/load compatibility risk.
Given the current wrapper load path (load_model(..., compile=False)), Lambda serialization issues can recur. Switching to Permute is safer.
Suggested fix
-x = layers.Lambda(lambda t: tf.transpose(t, perm=[0, 2, 1]))(seq_input)
+x = layers.Permute((2, 1), name="to_feature_tokens")(seq_input)
 ...
-x = layers.Lambda(lambda t: tf.transpose(t, perm=[0, 2, 1]))(x)
+x = layers.Permute((2, 1), name="to_time_major")(x)
Please verify against the official TensorFlow Keras documentation the constraints on saving a Lambda layer to .keras and then calling load_model without custom_objects, and whether the Permute layer is the recommended alternative.

Also applies to: 247-247
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 223 - 224, Replace the serialization-unsafe Lambda(tf.transpose) usage with Keras' built-in Permute layer to avoid load/save custom object issues: locate the Lambda call that transposes seq_input (layers.Lambda(lambda t: tf.transpose(t, perm=[0,2,1]))(seq_input)) and replace it with layers.Permute((2,1)) applied to the input (same replacement for the duplicate at the other occurrence), ensuring the perm tuple swaps the time and feature axes (Permute uses dims excluding batch).
138-146: ⚠️ Potential issue | 🟠 Major
Date ordering is not guaranteed before the macro change-rate calculations.
diff()/pct_change() are row-order dependent, so with only drop_duplicates("date") as now, us10y_chg, dxy_chg, and vix_change_rate can be distorted when the input order is unsorted.
Suggested fix
-macro_df = macro_df[available_macro].drop_duplicates("date")
+macro_df = (
+    macro_df[available_macro]
+    .sort_values("date")
+    .drop_duplicates("date", keep="last")
+    .reset_index(drop=True)
+)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 138 - 146, The change-rate calculations for us10y/dxy/vix are order-dependent; before calling diff()/pct_change() ensure macro_df is sorted by the date column and that date is a proper datetime type: convert the date column to datetime if needed, run macro_df = macro_df.sort_values("date") (or sort_values(by=["date"]) and then drop_duplicates("date")/reset_index(drop=True) so that the subsequent us10y_chg, dxy_chg and vix_change_rate computed from macro_df["us10y"].diff() and pct_change() use correctly ordered rows.
🧹 Nitpick comments (1)
AI/modules/signal/models/itransformer/train_kaggle.py (1)
41-41: Renaming the unused loop variables to _dirs and _ticker would reduce lint warnings. There is no behavioral impact, but it helps clear the Ruff (B007) warnings.
Also applies to: 188-188
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/modules/signal/models/itransformer/train_kaggle.py` at line 41, Rename unused loop variables to start with an underscore to silence Ruff B007: change the second variable named dirs to _dirs in the os.walk loop (the line containing "for root, dirs, files in os.walk(base):") and likewise rename any unused ticker variable at the other occurrence (the loop around line 188 where _ticker is suggested) so they become _dirs and _ticker respectively; this preserves behavior while removing lint warnings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Around line 216-221: The model built by build_model currently only declares a
single Input("sequence_input") while the inference wrapper expects three inputs
([feature_frame, ticker_id, sector_id]); update build_model to expose a matching
input signature by adding inputs for ticker_id and sector_id (e.g., Input layers
for those IDs), incorporate their embeddings or processing into the model
pipeline (concatenate or broadcast/expand as needed with the sequence_input
features), and return a tf.keras.Model that accepts the three tensors so
wrapper.predict([feature_frame, ticker_id, sector_id]) will match the saved
model signature.
---
Duplicate comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Line 55: The hardcoded train_end_date value ("2023-12-31") in train_kaggle.py
causes post-2023 data to be excluded every week; change the default for
train_end_date to None (or to read an optional BACKFILL_TRAIN_END_DATE env var)
so training uses data up to "now" by default, and only apply a fixed cutoff when
an explicit environment variable is provided for backfill runs; update
references that read train_end_date (the variable named train_end_date around
the train configuration/initialization) to handle None as “no cutoff” and
convert an explicit env var to a date when present.
- Around line 223-224: Replace the serialization-unsafe Lambda(tf.transpose)
usage with Keras' built-in Permute layer to avoid load/save custom object
issues: locate the Lambda call that transposes seq_input (layers.Lambda(lambda
t: tf.transpose(t, perm=[0,2,1]))(seq_input)) and replace it with
layers.Permute((2,1)) applied to the input (same replacement for the duplicate
at the other occurrence), ensuring the perm tuple swaps the time and feature
axes (Permute uses dims excluding batch).
- Around line 138-146: The change-rate calculations for us10y/dxy/vix are
order-dependent; before calling diff()/pct_change() ensure macro_df is sorted by
the date column and that date is a proper datetime type: convert the date column
to datetime if needed, run macro_df = macro_df.sort_values("date") (or
sort_values(by=["date"]) and then drop_duplicates("date")/reset_index(drop=True)
so that the subsequent us10y_chg, dxy_chg and vix_change_rate computed from
macro_df["us10y"].diff() and pct_change() use correctly ordered rows.
---
Nitpick comments:
In `@AI/modules/signal/models/itransformer/train_kaggle.py`:
- Line 41: Rename unused loop variables to start with an underscore to silence
Ruff B007: change the second variable named dirs to _dirs in the os.walk loop
(the line containing "for root, dirs, files in os.walk(base):") and likewise
rename any unused ticker variable at the other occurrence (the loop around line
188 where _ticker is suggested) so they become _dirs and _ticker respectively;
this preserves behavior while removing lint warnings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 92d1520b-628c-4aa1-ade0-8ee6f4e79ba4
📒 Files selected for processing (1)
AI/modules/signal/models/itransformer/train_kaggle.py
def build_model(seq_len: int, n_features: int, n_outputs: int) -> tf.keras.Model:
    """iTransformer: 변수(feature) 축을 토큰으로 취급하는 Transformer"""
    from tensorflow.keras import layers

    # 입력
    seq_input = tf.keras.Input(shape=(seq_len, n_features), name="sequence_input")
The training model's input signature does not match the inference wrapper.
This file saves a single-input (sequence_input) model, but AI/modules/signal/models/itransformer/wrapper.py calls predict() with three inputs: [feature_frame, ticker_id, sector_id]. As it stands, input shape/arity errors at the inference stage after deployment are likely.
Suggested direction for alignment
- model = tf.keras.Model(inputs=seq_input, outputs=output)
+ # To match the wrapper contract, standardize on one of train/serve:
+ # (A) change the wrapper to a single-input model, or
+ # (B) give the train_kaggle model a three-input signature
+ model = tf.keras.Model(inputs=seq_input, outputs=output)

Also applies to: 259-259
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/modules/signal/models/itransformer/train_kaggle.py` around lines 216 -
221, The model built by build_model currently only declares a single
Input("sequence_input") while the inference wrapper expects three inputs
([feature_frame, ticker_id, sector_id]); update build_model to expose a matching
input signature by adding inputs for ticker_id and sector_id (e.g., Input layers
for those IDs), incorporate their embeddings or processing into the model
pipeline (concatenate or broadcast/expand as needed with the sequence_input
features), and return a tf.keras.Model that accepts the three tensors so
wrapper.predict([feature_frame, ticker_id, sector_id]) will match the saved
model signature.
Actionable comments posted: 5
♻️ Duplicate comments (2)
AI/kaggle_notebooks/patchtst/notebook.ipynb (1)
20-20: ⚠️ Potential issue | 🟠 Major
Please remove the hardcoded branch.
The fixed branch at Line 20 (20260326-#343-AI-kaggle-pipeline) will break the pipeline immediately if branches are cleaned up or renamed. A default of main plus an environment-variable override is safer.
Suggested fix
-"import subprocess\n",
+"import os, subprocess\n",
+"branch = os.environ.get('SISC_TRAIN_BRANCH', 'main')\n",
 "result = subprocess.run(\n",
-"    ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],\n",
+"    ['git', 'clone', '--depth', '1', '--branch', branch, 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],\n",
 "    capture_output=True, text=True\n",
 ")\n",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/patchtst/notebook.ipynb` at line 20, The git clone command in the notebook hardcodes the branch name ('20260326-#343-AI-kaggle-pipeline') which will break when branches are renamed; change the clone invocation string to use a branch variable that defaults to "main" and can be overridden via an environment variable (e.g., read from os.environ or notebook parameters) so the command uses that variable instead of the fixed literal; locate the git clone string in the notebook cell (the line containing "git clone --depth '1' --branch ... 'https://github.com/SISC-IT/sisc-web.git'") and replace the fixed branch token with the branch variable lookup and default logic.

AI/scripts/download_weights.py (1)
Verify each finding against the current code and only fix it if needed. In `@AI/kaggle_notebooks/patchtst/notebook.ipynb` at line 20, The git clone command in the notebook hardcodes the branch name ('20260326-#343-AI-kaggle-pipeline') which will break when branches are renamed; change the clone invocation string to use a branch variable that defaults to "main" and can be overridden via an environment variable (e.g., read from os.environ or notebook parameters) so the command uses that variable instead of the fixed literal; locate the git clone string in the notebook cell (the line containing "git clone --depth '1' --branch ... 'https://github.com/SISC-IT/sisc-web.git'") and replace the fixed branch token with the branch variable lookup and default logic.AI/scripts/download_weights.py (1)
158-167: ⚠️ Potential issue | 🟠 Major
Even a partial download is treated as success.
True is returned even when only some of the files listed in keep_files were downloaded. For example, if the .pt file was fetched but the .pkl scaler is missing, inference will fail. deploy_to_server.py likewise only warns about missing files and keeps going, so an incomplete model can be deployed.
It is safer to decide success only after verifying that all required files were downloaded.
🛡️ Validation logic for all required files
 if missing:
     print(f"  [{model['name']}] 경고: 파일 없음 -> {missing}")
-if not copied:
+    print(f"  [{model['name']}] 실패: 필수 파일 누락")
+    return False
+elif not copied:
     print(f"  [{model['name']}] 실패: 가중치 파일 없음")
     return False
 print(f"  [{model['name']}] 다운로드 완료!")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` around lines 158 - 167, The current download result logic treats partial downloads as success because it returns True whenever any files were copied; update the post-download validation in the same function that uses the variables copied, missing, and model['name'] to require that all expected files (those listed in keep_files for the model) are present: explicitly check that missing is empty (or that set(keep_files) ⊆ set(copied)) before printing "다운로드 완료!" and returning True, otherwise print a clear error including model['name'] and the missing files and return False; ensure this replaces the existing early return behavior so deploy_to_server.py won't proceed on incomplete models.
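The all-or-nothing check above reduces to a small set comparison. The helper name `verify_download` is hypothetical; the real script reports results through its own print messages:

```python
def verify_download(keep_files, copied):
    """Succeed only when every required artifact is present."""
    copied_set = set(copied)
    missing = [f for f in keep_files if f not in copied_set]
    return len(missing) == 0, missing
```

Returning the missing list alongside the flag lets the caller both fail the step and log exactly which artifacts (e.g., a .pkl scaler) were not fetched.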
🧹 Nitpick comments (1)
AI/scripts/download_weights.py (1)
111-134: Fixing the style issues in the CLI fallback code is recommended. Static analysis found a few code-quality issues:
- Line 116: unused variable dirs → rename to _dirs
- Line 132: multiple statements on one line → split them
♻️ Style fixes
-    for root, dirs, files in os.walk(tmp_dir):
+    for root, _dirs, files in os.walk(tmp_dir):
         for f in files:
             if f not in file_map:
                 file_map[f] = os.path.join(root, f)
     copied = []
     for fname in model['keep_files']:
         if fname in file_map:
             src = file_map[fname]
             if os.path.getsize(src) < 100:
                 continue
             dst = os.path.join(model['dst_dir'], fname)
             shutil.copy2(src, dst)
             size = os.path.getsize(dst) / (1024 * 1024)
             copied.append(f"{fname} ({size:.1f} MB)")
     if copied:
         print(f"  [{model['name']}] 다운로드 완료!")
-        for f in copied: print(f"    - {f}")
+        for f in copied:
+            print(f"    - {f}")
         return True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` around lines 111 - 134, The code block that builds file_map and prints results has two style issues: change the unused loop variable name from dirs to _dirs in the os.walk unpack (for root, _dirs, files in os.walk(tmp_dir)) to indicate it's intentionally unused, and split the single-line compound statement that prints copied files (for f in copied: print(...)) into a standard loop with the print on its own line so each statement is on its own line; locate these changes around the os.walk loop and the copied-files printing logic in the download_weights.py snippet (references: variable names dirs/_dirs, files, and the copied printing loop).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/kaggle_notebooks/patchtst/notebook.ipynb`:
- Line 44: The notebook's runtime dependencies are installed without pinned
versions in the line that runs pip install (the "!pip install -q scikit-learn
pandas numpy torch tqdm" cell); update that cell to pin or constrain versions to
ensure reproducibility—either (1) run a cell to inspect the Kaggle environment
(e.g., using pip list / pip show) and then pin the detected versions for numpy,
torch, scikit-learn, pandas, and tqdm, or (2) set safe compatibility constraints
such as numpy>=2.2.6 and torch>=2.6.0 (and appropriate versions for
scikit-learn/pandas) so the notebook uses known-compatible releases rather than
no-version installs.
- Around line 19-22: The subprocess.run calls that execute the git clone and
nvidia-smi commands (the subprocess.run invocation for ['git', 'clone', ...
'sisc-web'] and the subprocess.run call for 'nvidia-smi') lack error checking;
update each call to either pass check=True to let Python raise
CalledProcessError on failure or capture the returned CompletedProcess and
explicitly inspect result.returncode and stderr, then log and raise an exception
(or sys.exit) on non-zero return codes so failures are propagated clearly;
locate the subprocess.run invocations by the command lists shown and apply the
same error-handling pattern to both.
In `@AI/scripts/download_weights.py`:
- Around line 59-69: The requests.get call in list_output_files can hang because
it lacks a timeout; update the call to pass a reasonable timeout (e.g.,
timeout=10) and handle timeout exceptions (requests.exceptions.Timeout and
requests.exceptions.RequestException) around the call in the list_output_files
function so you return an empty list or log an error on failure; ensure the
auth=(KAGGLE_USERNAME, KAGGLE_KEY) usage remains and only the requests.get call
and its exception handling are changed.
- Around line 38-43: The Transformer weight filenames in the download config
mismatch deploy_to_server.py expectations (it expects names with _prod): update
the download step to either output the Kaggle files with the _prod suffix or add
a rename mapping in the download_weights logic (e.g., inside the
download_weights function) that maps "multi_horizon_model.keras" ->
"multi_horizon_model_prod.keras" and "multi_horizon_scaler.pkl" ->
"multi_horizon_scaler_prod.pkl" and apply os.rename after download so the files
match deploy_to_server.py's expected names.
- Around line 72-93: The download_file function currently calls requests.get
without timeouts; add explicit connect and read timeouts to avoid indefinite
hangs (e.g., pass timeout=(10, 300) to requests.get while keeping stream=True),
and ensure the timeout tuple is used when calling requests.get in download_file
so large weight downloads can fail fast on connect issues and allow a longer
read window; keep the existing response handling (status check, iter_content,
size check) unchanged.
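The timeout-and-fallback pattern requested for list_output_files can be sketched as follows. This is a hedged sketch, not the repository's function: `fetch` stands in for requests.get(url, auth=(KAGGLE_USERNAME, KAGGLE_KEY), timeout=10), and the real code would catch requests.exceptions.RequestException instead of the stdlib errors used here.

```python
def list_output_files(fetch):
    """Return the file list, or [] if the HTTP call fails or times out."""
    try:
        resp = fetch()  # may raise a timeout/connection error
        return resp.get("files", [])
    except (TimeoutError, ConnectionError, OSError) as e:
        print(f"[오류] 출력 파일 조회 실패: {e}")
        return []

def ok():
    return {"files": ["model.pt", "metadata.json"]}

def boom():
    raise TimeoutError("connect timed out")

print(list_output_files(ok))    # → ['model.pt', 'metadata.json']
print(list_output_files(boom))  # → []
```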
---
Duplicate comments:
In `@AI/kaggle_notebooks/patchtst/notebook.ipynb`:
- Line 20: The git clone command in the notebook hardcodes the branch name
('20260326-#343-AI-kaggle-pipeline') which will break when branches are renamed;
change the clone invocation string to use a branch variable that defaults to
"main" and can be overridden via an environment variable (e.g., read from
os.environ or notebook parameters) so the command uses that variable instead of
the fixed literal; locate the git clone string in the notebook cell (the line
containing "git clone --depth '1' --branch ...
'https://github.com/SISC-IT/sisc-web.git'") and replace the fixed branch token
with the branch variable lookup and default logic.
In `@AI/scripts/download_weights.py`:
- Around line 158-167: The current download result logic treats partial
downloads as success because it returns True whenever any files were copied;
update the post-download validation in the same function that uses the variables
copied, missing, and model['name'] to require that all expected files (those
listed in keep_files for the model) are present: explicitly check that missing
is empty (or that set(keep_files) ⊆ set(copied)) before printing "다운로드 완료!" and
returning True, otherwise print a clear error including model['name'] and the
missing files and return False; ensure this replaces the existing early return
behavior so deploy_to_server.py won't proceed on incomplete models.
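The stricter success criterion described above can be sketched as a small helper. The name downloads_complete is hypothetical; the real script works with its copied/missing variables directly.

```python
def downloads_complete(keep_files, copied):
    """Return (ok, missing): ok is True only if every expected file arrived."""
    missing = sorted(set(keep_files) - set(copied))
    return (len(missing) == 0, missing)

# A partial download must count as failure, not success.
print(downloads_complete(
    ["multi_horizon_model_prod.keras", "multi_horizon_scaler_prod.pkl"],
    ["multi_horizon_model_prod.keras"],
))  # → (False, ['multi_horizon_scaler_prod.pkl'])
```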
---
Nitpick comments:
In `@AI/scripts/download_weights.py`:
- Around line 111-134: The code block that builds file_map and prints results
has two style issues: change the unused loop variable name from dirs to _dirs in
the os.walk unpack (for root, _dirs, files in os.walk(tmp_dir)) to indicate it's
intentionally unused, and split the single-line compound statement that prints
copied files (for f in copied: print(...)) into a standard loop with the print
on its own line so each statement is on its own line; locate these changes
around the os.walk loop and the copied-files printing logic in the
download_weights.py snippet (references: variable names dirs/_dirs, files, and
the copied printing loop).
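Both style fixes can be seen together in a minimal sketch; directory contents here are illustrative.

```python
import os
import tempfile

def collect_files(tmp_dir):
    file_map = {}
    # `_dirs` signals that the middle os.walk value is intentionally unused.
    for root, _dirs, files in os.walk(tmp_dir):
        for f in files:
            file_map[f] = os.path.join(root, f)
    return file_map

with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "model.pt"), "w").close()
    copied = sorted(collect_files(d))
    # One statement per line instead of `for f in copied: print(...)`.
    for f in copied:
        print(f"  - {f}")  # → - model.pt
```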
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 301e7a0c-8800-4e06-9d0a-59dba872f9ea
📒 Files selected for processing (2)
AI/kaggle_notebooks/patchtst/notebook.ipynb
AI/scripts/download_weights.py
| "result = subprocess.run(\n", | ||
| " ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],\n", | ||
| " capture_output=True, text=True\n", | ||
| ")\n", |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# subprocess.run 호출 중 check 인자 누락 위치 점검
rg -n "subprocess\.run\(" AI/kaggle_notebooks/patchtst/notebook.ipynb -C2
rg -n "\"check=True\"" AI/kaggle_notebooks/patchtst/notebook.ipynb -C1Repository: SISC-IT/sisc-web
Length of output: 623
🏁 Script executed:
# Extract the notebook and examine the context around subprocess.run calls
python3 << 'EOF'
import json
try:
with open('AI/kaggle_notebooks/patchtst/notebook.ipynb', 'r') as f:
notebook = json.load(f)
# Find cells with subprocess.run
for i, cell in enumerate(notebook['cells']):
if cell['cell_type'] == 'code':
source = ''.join(cell['source'])
if 'subprocess.run' in source:
print(f"\n=== Cell {i} ===")
print(source)
print("---")
except Exception as e:
print(f"Error: {e}")
EOF

Repository: SISC-IT/sisc-web
Length of output: 633
🏁 Script executed:
# Also check if there are any try-except blocks or returncode checks
rg -n "returncode|except|try" AI/kaggle_notebooks/patchtst/notebook.ipynb -C3Repository: SISC-IT/sisc-web
Length of output: 42
Subprocess calls are missing error checking.
Both the git clone call (lines 19-22) and the nvidia-smi call (line 47) have no error handling on failure. Add check=True, or explicitly inspect returncode, so that failures propagate clearly.
Suggested fix

 result = subprocess.run(
     ['git', 'clone', '--depth', '1', '--branch', '20260326-#343-AI-kaggle-pipeline', 'https://github.com/SISC-IT/sisc-web.git', '/kaggle/working/sisc-web'],
-    capture_output=True, text=True
+    capture_output=True, text=True, check=True
 )

-result = subprocess.run(['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'], capture_output=True, text=True)
+result = subprocess.run(['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'], capture_output=True, text=True, check=True)

🧰 Tools
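The check=True pattern can be illustrated with a runnable stand-in; the command below is a harmless substitute for git clone and nvidia-smi, used only to show how failures surface.

```python
import subprocess
import sys

try:
    result = subprocess.run(
        [sys.executable, "-c", "print('ok')"],  # stand-in for the real command
        capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())  # → ok
except subprocess.CalledProcessError as e:
    # With check=True, a non-zero exit code surfaces as an exception
    # carrying the return code and captured stderr.
    print(f"command failed ({e.returncode}): {e.stderr}")
```

The same try/except wrapping applies unchanged to both subprocess.run calls in the notebook.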
🪛 Ruff (0.15.7)
[error] 19-19: Starting a process with a partial executable path
(S607)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/kaggle_notebooks/patchtst/notebook.ipynb` around lines 19 - 22, The
subprocess.run calls that execute the git clone and nvidia-smi commands (the
subprocess.run invocation for ['git', 'clone', ... 'sisc-web'] and the
subprocess.run call for 'nvidia-smi') lack error checking; update each call to
either pass check=True to let Python raise CalledProcessError on failure or
capture the returned CompletedProcess and explicitly inspect result.returncode
and stderr, then log and raise an exception (or sys.exit) on non-zero return
codes so failures are propagated clearly; locate the subprocess.run invocations
by the command lists shown and apply the same error-handling pattern to both.
{
    "name"      : "Transformer",
    "slug"      : f"{KAGGLE_USERNAME}/transformer-training",
    "dst_dir"   : os.path.join(project_root, "AI/data/weights/transformer/prod"),
    "keep_files": ["multi_horizon_model.keras", "multi_horizon_scaler.pkl"],
},
The Transformer model filenames do not match deploy_to_server.py.
deploy_to_server.py expects multi_horizon_model_prod.keras and multi_horizon_scaler_prod.pkl, but here the files are downloaded without the _prod suffix. As a result, deployment cannot find the files and the Transformer weights are never deployed.
Two possible fixes:
- Change the output filenames in the Kaggle notebook to use the _prod suffix
- Rename the files after download (suggested below)
🐛 Suggested fix for the filename mismatch

 {
     "name"      : "Transformer",
     "slug"      : f"{KAGGLE_USERNAME}/transformer-training",
     "dst_dir"   : os.path.join(project_root, "AI/data/weights/transformer/prod"),
-    "keep_files": ["multi_horizon_model.keras", "multi_horizon_scaler.pkl"],
+    "keep_files": ["multi_horizon_model_prod.keras", "multi_horizon_scaler_prod.pkl"],
 },

Alternatively, if the Kaggle notebook's output filenames cannot change, add rename logic after download:

# Add a filename mapping inside the download_weights function
RENAME_MAP = {
    "multi_horizon_model.keras": "multi_horizon_model_prod.keras",
    "multi_horizon_scaler.pkl": "multi_horizon_scaler_prod.pkl",
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
{
    "name"      : "Transformer",
    "slug"      : f"{KAGGLE_USERNAME}/transformer-training",
    "dst_dir"   : os.path.join(project_root, "AI/data/weights/transformer/prod"),
    "keep_files": ["multi_horizon_model_prod.keras", "multi_horizon_scaler_prod.pkl"],
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/download_weights.py` around lines 38 - 43, The Transformer weight
filenames in the download config mismatch deploy_to_server.py expectations (it
expects names with _prod): update the download step to either output the Kaggle
files with the _prod suffix or add a rename mapping in the download_weights
logic (e.g., inside the download_weights function) that maps
"multi_horizon_model.keras" -> "multi_horizon_model_prod.keras" and
"multi_horizon_scaler.pkl" -> "multi_horizon_scaler_prod.pkl" and apply
os.rename after download so the files match deploy_to_server.py's expected
names.
def download_file(slug: str, file_name: str, dst_path: str) -> bool:
    """Kaggle API로 특정 파일 다운로드 (스트리밍)"""
    owner, kernel = slug.split("/")
    url = f"https://www.kaggle.com/api/v1/kernels/output/{owner}/{kernel}?fileName={file_name}"

    with requests.get(url, auth=(KAGGLE_USERNAME, KAGGLE_KEY), stream=True) as resp:
        if resp.status_code != 200:
            print(f"  [오류] 다운로드 실패: {file_name} ({resp.status_code})")
            return False

        total = int(resp.headers.get("content-length", 0))
        downloaded = 0
        with open(dst_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=8192):
                f.write(chunk)
                downloaded += len(chunk)

    actual = os.path.getsize(dst_path)
    if total > 0 and actual != total:
        print(f"  [경고] {file_name} 크기 불일치: 예상 {total}, 실제 {actual}")
        return False
    return True
File downloads also need a timeout.
When downloading large weight files, a connection failure or stall can hang indefinitely. Set separate connect and read timeouts.
🛡️ Add a timeout (separate connect and read)

-    with requests.get(url, auth=(KAGGLE_USERNAME, KAGGLE_KEY), stream=True) as resp:
+    with requests.get(url, auth=(KAGGLE_USERNAME, KAGGLE_KEY), stream=True, timeout=(10, 300)) as resp:

(10, 300) means a 10-second connect timeout and a 300-second (5-minute) read timeout. For very large files the read timeout may need to be increased.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def download_file(slug: str, file_name: str, dst_path: str) -> bool:
    """Kaggle API로 특정 파일 다운로드 (스트리밍)"""
    owner, kernel = slug.split("/")
    url = f"https://www.kaggle.com/api/v1/kernels/output/{owner}/{kernel}?fileName={file_name}"

    with requests.get(url, auth=(KAGGLE_USERNAME, KAGGLE_KEY), stream=True, timeout=(10, 300)) as resp:
        if resp.status_code != 200:
            print(f"  [오류] 다운로드 실패: {file_name} ({resp.status_code})")
            return False

        total = int(resp.headers.get("content-length", 0))
        downloaded = 0
        with open(dst_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=8192):
                f.write(chunk)
                downloaded += len(chunk)

    actual = os.path.getsize(dst_path)
    if total > 0 and actual != total:
        print(f"  [경고] {file_name} 크기 불일치: 예상 {total}, 실제 {actual}")
        return False
    return True
🧰 Tools
🪛 Ruff (0.15.7)
[error] 77-77: Probable use of requests call without timeout
(S113)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/download_weights.py` around lines 72 - 93, The download_file
function currently calls requests.get without timeouts; add explicit connect and
read timeouts to avoid indefinite hangs (e.g., pass timeout=(10, 300) to
requests.get while keeping stream=True), and ensure the timeout tuple is used
when calling requests.get in download_file so large weight downloads can fail
fast on connect issues and allow a longer read window; keep the existing
response handling (status check, iter_content, size check) unchanged.
Actionable comments posted: 1
🧹 Nitpick comments (4)
AI/scripts/download_weights.py (4)
321: Unnecessary f-string prefix. The f-string contains no placeholders.
♻️ Remove the f prefix

- print(f"\n>> 완료")
+ print("\n>> 완료")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` at line 321, The print statement at the end of AI/scripts/download_weights.py uses an f-string without any placeholders (print(f"\n>> 완료")); remove the unnecessary f prefix so the call uses a normal string literal (print("\n>> 완료")) to avoid misleading formatting usage and minor runtime overhead. Ensure you edit the lone print statement in the script accordingly.
102-103: The HORIZONS variable is unused.
HORIZONS is defined, but CONFIG['horizons'] is what the code actually uses. Remove it, or make it the single source of truth, to avoid confusion from the duplicate definition.
♻️ Suggestion: remove the unused variable

-HORIZONS = [1, 3, 5, 7]
-

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` around lines 102 - 103, The HORIZONS constant is defined but never used while the code reads CONFIG['horizons'], causing duplication/confusion; either remove the unused HORIZONS declaration or make CONFIG read from HORIZONS instead. Update the top-level HORIZONS variable (remove it) and ensure all references use CONFIG['horizons'], or replace occurrences of CONFIG['horizons'] with HORIZONS and keep the single source of truth (adjust initialization code that builds CONFIG accordingly); target the HORIZONS definition and any places referencing CONFIG['horizons'] in download_weights.py (e.g., where horizons are iterated) to make the choice consistent.
189-193: The exception handling needs improvement.
Catching Exception broadly can hide unexpected errors. Use raise ... from e to preserve the exception chain.
♻️ Improved exception handling

 try:
     df_t = add_technical_indicators(df_t)
     df_t = add_multi_timeframe_features(df_t)
     processed.append(df_t)
-except Exception as e:
+except (KeyError, ValueError, TypeError) as e:
     fail_count += 1
     print(f"\n[경고] {ticker} 피처 계산 실패 ({fail_count}/{fail_limit}): {e}")
     if fail_count >= fail_limit:
-        raise RuntimeError(f"피처 계산 실패가 {fail_limit}개를 초과했습니다.")
+        raise RuntimeError(f"피처 계산 실패가 {fail_limit}개를 초과했습니다.") from e

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` around lines 189 - 193, The except block currently catches broad Exception (hiding unexpected errors) and raises a new RuntimeError without preserving the original exception chain; update the try/except around the feature calculation to catch only expected exception types (e.g., ValueError, KeyError, IOError, or specific custom exceptions your feature calc can raise) instead of Exception, increment fail_count and log as before, and when failing the limit use "raise RuntimeError(f'피처 계산 실패가 {fail_limit}개를 초과했습니다.') from e" to preserve the original traceback; locate the try/except that references fail_count, fail_limit, and ticker and make these changes there.
217-225: No ticker shuffling in the train/val split.
The last 20% of tickers (tickers[-n_val:]) is used as the validation set; if tickers are sorted by some criterion (e.g., listing date), the validation set may not be representative. Unless this ordering is intentional, consider shuffling.
♻️ Suggested ticker shuffling

+import random
+random.seed(42)
 tickers = full_df['ticker'].unique()
+tickers = list(tickers)
+random.shuffle(tickers)
 n_val = max(1, int(len(tickers) * 0.2))
 val_tickers = tickers[-n_val:]
 train_tickers = tickers[:-n_val]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AI/scripts/download_weights.py` around lines 217 - 225, The current train/validation split selects the last 20% of tickers (tickers, n_val, val_tickers, train_tickers), which can be biased if tickers are ordered; shuffle the unique tickers before slicing to ensure a representative random split and (optionally) use a fixed random seed for reproducibility; update the code where tickers is defined so you either call a shuffle (e.g., numpy.random.default_rng(seed).shuffle(tickers)) or obtain a randomized order (e.g., pandas.Series(tickers).sample(frac=1, random_state=seed).values) before computing n_val, val_tickers and train_tickers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AI/scripts/download_weights.py`:
- Around line 113-116: The warning for missing features currently prints missing
cols but leaves model input size fixed to len(FEATURE_COLUMNS), causing
dimension mismatch; update the pipeline so the model's enc_in matches the actual
used features by either having build_sequences return the actual feature
list/length (e.g., return available or its length) and have train() consume that
to set enc_in, or change train() to compute enc_in = len(available) (using the
same available variable computed where FEATURES_COLUMNS is filtered); adjust
references in build_sequences and train to use the actual feature count instead
of len(FEATURE_COLUMNS) so model input dims stay consistent with the data.
---
Nitpick comments:
In `@AI/scripts/download_weights.py`:
- Line 321: The print statement at the end of AI/scripts/download_weights.py
uses an f-string without any placeholders (print(f"\n>> 완료")); remove the
unnecessary f prefix so the call uses a normal string literal (print("\n>> 완료"))
to avoid misleading formatting usage and minor runtime overhead. Ensure you edit
the lone print statement in the script accordingly.
- Around line 102-103: The HORIZONS constant is defined but never used while the
code reads CONFIG['horizons'], causing duplication/confusion; either remove the
unused HORIZONS declaration or make CONFIG read from HORIZONS instead. Update
the top-level HORIZONS variable (remove it) and ensure all references use
CONFIG['horizons'], or replace occurrences of CONFIG['horizons'] with HORIZONS
and keep the single source of truth (adjust initialization code that builds
CONFIG accordingly); target the HORIZONS definition and any places referencing
CONFIG['horizons'] in download_weights.py (e.g., where horizons are iterated) to
make the choice consistent.
- Around line 189-193: The except block currently catches broad Exception
(hiding unexpected errors) and raises a new RuntimeError without preserving the
original exception chain; update the try/except around the feature calculation
to catch only expected exception types (e.g., ValueError, KeyError, IOError, or
specific custom exceptions your feature calc can raise) instead of Exception,
increment fail_count and log as before, and when failing the limit use "raise
RuntimeError(f'피처 계산 실패가 {fail_limit}개를 초과했습니다.') from e" to preserve the
original traceback; locate the try/except that references fail_count,
fail_limit, and ticker and make these changes there.
- Around line 217-225: The current train/validation split selects the last 20%
of tickers (tickers, n_val, val_tickers, train_tickers), which can be biased if
tickers are ordered; shuffle the unique tickers before slicing to ensure a
representative random split and (optionally) use a fixed random seed for
reproducibility; update the code where tickers is defined so you either call a
shuffle (e.g., numpy.random.default_rng(seed).shuffle(tickers)) or obtain a
randomized order (e.g., pandas.Series(tickers).sample(frac=1,
random_state=seed).values) before computing n_val, val_tickers and
train_tickers.
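The shuffle-then-split idea can be sketched with the stdlib alone; the function name is hypothetical, and the 20% validation fraction mirrors the original code.

```python
import random

def split_tickers(tickers, val_frac=0.2, seed=42):
    tickers = list(tickers)
    random.Random(seed).shuffle(tickers)  # deterministic, seeded shuffle
    n_val = max(1, int(len(tickers) * val_frac))
    return tickers[:-n_val], tickers[-n_val:]

train, val = split_tickers(["AAPL", "MSFT", "GOOG", "AMZN", "META"])
print(len(train), len(val))  # → 4 1
```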
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8081828a-ec60-4353-851c-068ee62d6a37
📒 Files selected for processing (1)
AI/scripts/download_weights.py
AI/scripts/download_weights.py
Outdated
    available = [c for c in FEATURE_COLUMNS if c in full_df.columns]
    missing = set(FEATURE_COLUMNS) - set(available)
    if missing:
        print(f"[경고] 누락된 피처: {missing}")
A model dimension mismatch is possible after the missing-feature warning.
When features are missing, the code prints a warning and uses only the available columns, but in train() the model's enc_in is fixed to len(FEATURE_COLUMNS) (line 249). If the actual feature count differs from the model's input dimension, a runtime error occurs.
🐛 Suggested fix: make the model input dimension match the actual feature count
Have build_sequences return the number of features actually used, or use the actual count when constructing the model:

 def build_sequences(full_df: pd.DataFrame, scaler: MinMaxScaler, fit_scaler: bool = True):
     seq_len = CONFIG['seq_len']
     horizons = CONFIG['horizons']
     max_horizon = max(horizons)

     available = [c for c in FEATURE_COLUMNS if c in full_df.columns]
     missing = set(FEATURE_COLUMNS) - set(available)
     if missing:
-        print(f"[경고] 누락된 피처: {missing}")
+        raise ValueError(f"필수 피처 누락: {missing}")

Or, in the train() function:

 model = PatchTST_Model(
     seq_len = CONFIG['seq_len'],
-    enc_in  = len(FEATURE_COLUMNS),
+    enc_in  = X_train.shape[2],  # use the actual feature count
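The idea of deriving the input size from the data rather than a constant can be shown without any framework; shapes are illustrative and infer_enc_in is a hypothetical name.

```python
def infer_enc_in(batch):
    """batch shaped [N][seq_len][n_features] as nested lists."""
    return len(batch[0][0])

# 2 samples, seq_len 4, 3 features per timestep
X_train = [[[0.1, 0.2, 0.3]] * 4] * 2
print(infer_enc_in(X_train))  # → 3
```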
Verify each finding against the current code and only fix it if needed.
In `@AI/scripts/download_weights.py` around lines 113 - 116, The warning for
missing features currently prints missing cols but leaves model input size fixed
to len(FEATURE_COLUMNS), causing dimension mismatch; update the pipeline so the
model's enc_in matches the actual used features by either having build_sequences
return the actual feature list/length (e.g., return available or its length) and
have train() consume that to set enc_in, or change train() to compute enc_in =
len(available) (using the same available variable computed where
FEATURES_COLUMNS is filtered); adjust references in build_sequences and train to
use the actual feature count instead of len(FEATURE_COLUMNS) so model input dims
stay consistent with the data.
…/SISC-IT/sisc-web into 20260326-#343-AI-kaggle-pipeline
With the training period starting from 2022.08, performance drops to around 0.7. Will switch the test period to start from 2023 and retrain.
Added regularization; retesting with the previous dropout/channel values.
Only the Kaggle-environment notebook is used; no separate local notebook.
Overview
Built an automated pipeline that uses Kaggle GPUs for deep-learning model training, which was infeasible on the production server due to its 2 GB memory limit.
Closes #343
Changes
Added
Modified
Summary by CodeRabbit
New Features
Bug Fixes
Chores